The value of critical destruction: Evaluating multispectral image processing methods for the analysis of primary historical texts

Alejandro Giacometti, Department of Medical Physics and Biomedical Engineering, UCL Centre for Digital Humanities, University College London, London
Alberto Campagnolo, Ligatus Research Centre, CCW Graduate School, University of the Arts London, London
Lindsay MacDonald, Photogrammetry, 3D Imaging and Metrology Research Centre, University College London, London
Simon Mahony, UCL Centre for Digital Humanities, Department of Information Studies, University College London, London
Stuart Robson, Photogrammetry, 3D Imaging and Metrology Research Centre, University College London, London
Tim Weyrich, Department of Computer Science, UCL Centre for Digital Humanities, University College London, London
Melissa Terras, Department of Information Studies, UCL Centre for Digital Humanities, University College London, London
Adam Gibson, Department of Medical Physics and Biomedical Engineering, University College London, London

Digital Scholarship in the Humanities, Advance Access published October 7, 2015. doi:10.1093/llc/fqv036

Correspondence: Melissa Terras, Department of Information Studies, Foster Court, University College London, Gower Street, WC1E 6BT, London. E-mail: m.terras@ucl.ac.uk

© The Author 2015. Published by Oxford University Press on behalf of EADH. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

Abstract

Multispectral imaging, a method for acquiring image data over a series of wavelengths across the light spectrum, is becoming a valuable tool within the cultural and heritage sector for the recovery and enhancement of information contained within primary historical texts. However, most applications of this technique, to date, have been bespoke: analysing particular documents of historic importance. There has been little prior work done on evaluating this technique in a structured fashion, to provide recommendations on how best to capture and process images when working with damaged and abraded textual material. This article introduces a new approach for evaluating the efficacy of image processing algorithms in recovering information from multispectral images of deteriorated primary historical texts. We present a series of experiments that deliberately degrade samples cut from a real historical document to provide a set of images acquired before and after damage. These images then allow us to compare, both objectively and quantitatively, the effectiveness of multispectral imaging and image processing for recovering information from damaged text. We develop a methodological framework for the continuing study of the techniques involved in the analysis and processing of multispectral images of primary historical texts, and a dataset which will be of use to others interested in advanced digitisation techniques within the cultural heritage sector.

1 Introduction

Multispectral imaging is an advanced digitisation method for acquiring image data over a series of wavelengths across the light spectrum. Combined with image processing, it has become a valuable tool for the enhancement and recovery of information contained within culturally important documents, providing a means, in some cases, to recover lost text, or examine other features no longer detectable by the human eye. However, applications of multispectral imaging within the cultural and heritage sector have mainly been bespoke, with limited access to or understanding of the techniques and methods used to recover damaged text. The barriers to accessing this technology will become lower as the equipment becomes commercially available; however, it is important that we better understand the methods and approaches used for multispectral imaging in order to be able to use such techniques efficiently, whilst maximising the information we can recover from cultural objects.

This article describes a highly interdisciplinary approach to evaluating multispectral imaging and image processing in the context of primary historical sources. We introduce a formal methodology to evaluate image processing of multispectral data and provide a framework for developing new, best practice methods when using multispectral processes to image damaged texts. We do so by first building up a large dataset of multispectral images of actual parchment, taken before and after a set of degradation procedures that were designed to match the most likely types of damage which may occur over the lifetime of parchment documents. This dataset then allows us to evaluate the efficacy of image processing algorithms attempting to recover damaged text, and to make recommendations on how best to apply multispectral imaging when attempting to recover information from damaged text.
Our novel approach, which requires the necessary, controlled destruction of a historical parchment document, presents a formal methodology in acquiring, processing, and analysing multispectral data. It also led to the creation of a large dataset consisting of a series of multispectral images showing both the initial and degraded state of samples from a real manuscript, providing a valuable tool for the advanced digitisation research community. As such, this article makes a major contribution to our understanding of how multispectral imaging can be used across the cultural and heritage sector, and demonstrates how an interdisciplinary approach centred on questions raised from within a Digital Humanities project can advance our understanding of image processing for both the cultural heritage and engineering science sectors.

2 The Digital Humanities and Imaging

Although most effort in the Digital Humanities is focussed on the production, analysis, and visualisation of text¹, there is a recent and growing interest in the community towards digital imaging, and how image capture and processing techniques can aid us in uncovering new bodies of information, particularly from historical documents². Digital imaging technology has been used to produce detailed and trustworthy surrogates of historical documents for decades (Deegan and Tanner, 2002; Hughes, 2004; Terras, 2008), and digitized versions of primary historical sources are often adequate for the needs of most scholars. However, improvements in image processing and analysis have led to a number of exciting and important digital humanities projects which can reveal a greater wealth of information about the originals, beyond traditional digitisation technologies.
Leveraged by technological improvements in image acquisition and image processing, humanities scholars have been able to image, analyse, and recover more information from historical texts (Chabries et al., 2003; Terras, 2006a; Salerno et al., 2007; Tanner and Bearman, 2009). One of the most promising techniques³ is multispectral imaging, which can provide additional evidence of the content of a document when it is difficult to read with the naked eye, when further information about the physical composition of a document and ink identification is required (Senvaitenë et al., 2005), or when information is required about its provenance (Tanner and Bearman, 2009).

3 Multispectral Imaging

Light is an electromagnetic wave, often characterised by its wavelength (which we perceive as colour), which is the distance between two consecutive peaks of the wave. The spectrum that is visible to humans includes wavelengths from approximately 380 nm to 760 nm (Fig. 1). Light with a wavelength longer than 760 nm is referred to as infrared; ultraviolet is light with wavelengths shorter than 380 nm (Peatross and Ware, 2013). Most digital imaging equipment captures the same broad spectra of light that is visible to humans with a combination of broadband red, green, and blue sensors (this is hardly surprising, given that the outputs of most imaging technologies are those which humans should be able to see). In contrast, multispectral imaging measures a series of discrete wavelengths over a defined range. These images can be acquired in the visible spectrum and also in the infrared and ultraviolet spectrum (with images that include a broader range of wavelengths often being referred to as hyperspectral (Landgrebe, 1999)).

Multispectral images are reasonably straightforward to acquire if appropriate light sources and detectors are available along with a method for wavelength selection.
A spectrum of light is usually selected through the use of filters, or via a light source⁴. For example, a series of filters placed in front of a camera lens can allow images to be captured in distinct wavebands (Hardeberg et al., 2002; Attas, 2004; Rapantzikos and Balas, 2005) or, in a more recent development, light sources can be used which emit at specific wavelengths (Easton et al., 2010; Marengo et al., 2011; Hollaus et al., 2013). Images may then be acquired using a commercial camera or a more sophisticated scanning system. The resulting sets of images can show different aspects of a document at different wavelengths (Fig. 2).

Multispectral imaging was first developed by NASA in the 1950s to determine the composition of objects in space (Landgrebe, 1999) and more recently has been used in medical imaging, for example in imaging the interior of the eye (Everdell et al., 2009). It is also incredibly useful to help in reading documents: given that different inks have different spectral signatures due to their differing chemical composition, multispectral imaging can be used to differentiate inks used in different areas on a document, different depths within a document (such as in the case of palimpsests) or to differentiate ink from other types of document damage, such as mould or abrasion.
In the cultural and heritage sectors, multispectral imaging has been used across a range of documents including the Archimedes Palimpsest (Salerno et al., 2007), the Dead Sea Scrolls (Chabries et al., 2003; Tanner and Bearman, 2009), carbonised scrolls from Herculaneum (Chabries et al., 2003), letters from the Hudson Bay Archives (Goltz et al., 2007), and palimpsests from the Saint Catherine Monastery in Egypt (Easton et al., 2010). It has also been used for improving tarnished daguerreotypes (Goltz and Hill, 2012), removing effects of ink-bleeding, ink-corrosion, and foxing (Joo Kim et al., 2011), and recovering the diaries of David Livingstone (Knox et al., 2011). Although multispectral imaging is currently the leading technique for recovering lost text in historical manuscripts, there are no guidelines which determine when it is the most appropriate technique compared to imaging at a single wavelength, or what the best wavelengths to use are. One purpose of this work is to establish a means to compare different imaging approaches objectively so that such guidelines can be evidence-based.

Most of the reported applications of this technique in the cultural and heritage sector are to specific documents of great historical importance. Wider use of the technology is now inevitable as more examples of successful recovery from multispectral images of historical documents arise, although a careful cost–benefit analysis is required to consider the type of data that a multispectral imaging project might yield, given the present (but falling) costs of undertaking this kind of imaging. As the availability of the technique is expected to increase, it is important to consider best practice⁵ in capturing and processing multispectral images.
Questions remain as to how best to take advantage of digital visualisation technology to present multispectral image data of cultural heritage to historians and palaeographers (Bonanni et al., 2009; Ponto et al., 2009). In addition, there is little evidence available about how best to process or analyse multispectral images (Giacometti, 2013). Further processing (the computational manipulation of digital images, see Gonzalez and Woods (1993) for an introduction) of multispectral images can allow important historical features and details to be identified, enhanced, and separated from other features, and it is important to understand what image processing approaches are most useful when dealing specifically with multispectral images of particular types of damage found on primary historical texts. Additionally, multispectral imaging can be misunderstood, with the technology sometimes being described as if it were magic (for example, see Zolfagharifard, 2014), and there is a need for a systematic investigation into the effectiveness and usefulness of the technique for the cultural and heritage sector.

Fig. 1 Multispectral images are captured in a similar process to colour images, but with many images captured at discrete narrow ranges of the light spectrum, rather than a small number of images which are each sensitive to light at a broader range of wavelengths¹⁹

Previous multispectral imaging capture projects, applied to specific examples of texts of historical importance, have concentrated on recording documents in their current state (generally once important features are illegible). Here, we investigate best practice in the multispectral imaging of heritage material by imaging a parchment document before and after a series of degradation processes, allowing us to assess the effectiveness of image processing algorithms to recover information from degraded documents.
This gives us a unique platform for evaluating the quality of recovered images, and allows us to assess the performance of image processing algorithms for analysis of these images. We propose a method for objectively comparing images of degraded documents, and develop a method for indicating which image processing methods are most appropriate for recovering text which has suffered from specific types of damage. In addition, at a time when ‘critical making’⁶ is being much discussed in the Digital Humanities, we propose that our approach to ‘critical destruction’ demonstrates the importance of adopting quantitative approaches when undertaking Digital Humanities research.

4 Method

The evaluation of image quality and the performance of methods which produce images of cultural heritage documents is a complex and challenging task (MacDonald and Jacobsen, 2006), and there has been little attempt to evaluate multispectral image quality previously. Partly this is because ‘quality’ is ill-defined. Here, we are able to introduce a new, objective definition of ‘quality’, namely, the amount of shared information between an image of the undamaged parchment and one of the recovered text. This is explored more fully in section 4.4. Existing multispectral image data are often particular to an individual document, and the success or failure of analysis is determined by the subjectively perceived legibility of the writing (Easton et al., 2003; Attas, 2004; Knox, 2008). In order to assess multispectral image processing methods objectively, it is necessary to acquire data under controlled conditions: capturing multispectral images of a manuscript before any degradation occurs, and then capturing images of the manuscript after degradation. These two sets of images enable evaluation of the image processing methods and their performance on a real degraded document⁷.
Naturally, this is impossible to do with historical text which has already been degraded (and no curator would allow us to degrade a primary historical text of any importance), but it is possible to adopt an experimental approach in which a real manuscript is deliberately degraded in a rigorously controlled fashion, and its corresponding deterioration documented via multispectral imaging. This allows us to quantitatively and objectively compare the recovery of text from the degraded manuscript and to evaluate the success of multispectral imaging and its related processing approaches when faced with specific documentary damage.

Fig. 2 Multispectral detail of a single feature from our sample 0602R captured using a monochrome camera. Note the variation in intensity and contrast of the writing and ink across the imaged wavelengths. It can be observed how, initially, the ink gains contrast slightly, with a darker background in the shorter wavelengths. Around the longer wavelengths in the visible spectrum and into the near-infrared, the contrast suddenly drops quite significantly, rendering the contrast in the 900 nm image almost null.

Such an experimental approach is common in the field of conservation: the changes that parchment (or more specifically, its collagen fibres) undergoes have been studied when subjecting samples to heat (Chahine, 2000), ultraviolet light (Meghea et al., 2004), even open flame (Giurginca et al., 2009), and chemical solutions (Dobrusina and Visotskite, 1994).
Other methods, such as optical coherence tomography (Góra et al., 2006), x-ray fluorescence, optical fluorescence (Dolgin et al., 2007), and x-ray diffraction (Kennedy and Wess, 2006), have been used to identify the state of degradation of parchment; however, ours is the first known study to focus on the macroscopic effects of degradation agents on the legibility of primary historical texts in order to understand how we can most effectively use multispectral methods.

4.1 Degraded and Degrading Text

We chose to focus on parchment documents for our study, given that parchment remains the primary medium of large quantities of culturally important documents in archives, museums, libraries, and private collections. A durable, stiff material, parchment is made of animal skin and consists of structured collagen fibres, but is highly sensitive to changes in humidity, and is endangered by biological, thermochemical, and mechanical damage (Reed, 1972; Clarkson, 1992; Larsen, 2007). Through consultation with conservators and archivists, we identified twenty methods of degradation that commonly affect historical parchment material, changing its physical characteristics at both microscopic and macroscopic levels (Vnouček, 2007). These included both physical and chemical agents to mimic the kinds of damage that parchment documents can be expected to incur during their lives, from technological mistakes during production, to improper use, unsuitable storage conditions, disasters, and natural ageing (Table 1). The damage agents were selected so as to affect not only the properties of the parchment, but also the legibility of text in various ways, for example, shrinking or otherwise deforming the parchment, and obscuring or effacing the writing via physical, chemical, or biological reactions and stains (Giacometti et al., 2012).

4.2 The Parchment Prepared

An eighteenth-century manuscript was donated for our experiment from London Metropolitan Archives⁸.
The document was an assignment of property which had been deemed to hold no historical or scholarly value, and had been de-accessioned from their collection prior to our request for parchment material in accordance with The National Archives guidance on deaccessioning and disposal (The National Archives, 2015). The manuscript was written in iron gall ink on prepared animal skin, measured approximately 70 × 70 cm, and was composed of two large leaves (Fig. 3), which were folded in thirds both horizontally and vertically. Both leaves had red margin guidelines and a blue seal glued outside the left margin. The outer leaf contained a large stamp on the top left corner and a fold trapping the inner leaf with red wax. Both leaves had writing on the recto covering most of the area of the leaf enclosed by the red margin. The outer leaf had a written section on its verso, detailing the date and contents of the document. The recto of the manuscript corresponded with the flesh side of the parchment, the verso with the hair side. Apart from some signs of wear and tear, especially around the folds of the text, it was in overall good condition.

Twenty-three 8 × 8 cm flat square sections were cut from the parchment as samples for this research: each sample contains written text. Each of the 23 samples was selected from a flat area of the manuscript, where the writing covered the surface of the recto (flesh side), and the verso was empty and without blemishes. Folds and marks of any kind were avoided. There were three exceptions: two of the samples were cut so as to contain writing on both the recto and verso, and one sample was cut from a folded area. The old fold sample and one of the samples with writing on both sides were kept as controls. The samples were imaged, then a series of treatments were applied to damage the twenty samples, as described in Table 1; three were left untreated and kept intact as controls (giving a total of 23 samples). The samples were then reimaged, producing image sets acquired before and after damage.

Table 1 Summary of different types of degradation (in small capitals), giving the reason for the degradation, the circumstances in which it might occur, and the type of degradation
Degradation reason | Circumstances | Mechanical damage | Chemical, biological, or environmental damage | Damage by extraneous substances
Technological mistakes | During manufacture | Lime solution, acidity, finishing | Scraping | Hydrochloric Acid, Calcium Hydroxide | Oil
During writing | Ink acidity | Sulphuric Acid
During binding | Unsympathetic binding | Mechanical Damage
Storage | Environmental changes | Temperature, humidity | Heat, Desiccant, Mould
Exposure to light | Visible light, UV | UV Light, Controls
Pollutants, dirt | Chemical reactions | Smoke, Sulphuric Acid, Controls
Natural disasters | Fire, smoke, water | Heat, Smoke, Water
Biodegradation | Micro-organisms, insects, rodents | Mould
Mechanical destruction | Rubbing, folding | Mechanical Damage
Use | Erasures, changes to text | Corrections, re-usage | Scraping | Iron Gall Ink
Mishandling, misuse | Mechanical Damage, Scrunching
Accidents | Spillage | Blood, Red Wine, Black Tea, Iron Gall Ink, Water | Blood, Oil, Red Wine, Black Tea, Aniline Dye, Iron Gall Ink, Indian ink
Repairs | Historical, conservation treatments | Water, Sodium Hypochlorite
Rebinding | Unsympathetic binding | Mechanical Damage
Palaeographical and conservation experiments | Palimpsest text recovery, bleaching | Hydrochloric Acid | Oil
Reformatting, digitisation | Mechanical Damage | UV Light
Natural ageing | Controls
The table also highlights the kinds of degradation that naturally occurred to the manuscript during our experiments; these are identified by the keyword Controls.

4.3 The Parchment Imaged

Multispectral images were acquired of every sample in two separate sessions in order to capture the samples in an untreated and treated state. The imaging station and equipment were not moved between sessions. Each sample was imaged under three modalities which represent different ways of illuminating and capturing the images (MacDonald et al., 2013). The Nikon camera is a consumer camera which has a high pixel resolution and acquires colour images; the monochrome camera is a scientific camera which acquires greyscale images only (making it more convenient for multispectral imaging through a filter) and has increased sensitivity to the infrared, but has a lower pixel resolution. Using both reflective and transmission imaging allowed both surface and deeper features to be detected.

(1) Colour reflective imaging. This maximises sensitivity to surface features on the parchment. A Nikon D200 camera was mounted on a copystand with white tungsten-halogen incandescent lighting set at 45° (these emit from 400 to 1,000 nm, with peak power from 500–750 nm; see Fig. 4). Sixteen bandpass filters with centre wavelengths 400–700 nm and bandwidth 20 nm (Unaxis Optics, USA) were fixed in turn to the front of the camera lens. The camera was fitted with a Nikkor 105 mm f/2.8 macro lens and set to an aperture of f/8 throughout.

(2) Colour transmission imaging. This used the same camera, but the light source was a lightbox beneath the parchment, ensuring that only light which had passed through the parchment was detected, thereby increasing sensitivity to deeper regions of the parchment, and avoiding specular reflections. Its fluorescent lamps had a narrower bandwidth and did not provide measurable power above 800 nm (which did not affect the colour imaging, because our longest wavelength filter was 700 nm). The same sixteen bandpass filters and the same camera lens were used. We also imaged each sample with no filter, giving 17 captures per sample.

(3) Monochromatic reflective.
In order to detect light in the near infrared, we used a monochrome camera (Kodak Megaplus 1.6i) which did not have an infrared cutoff filter. This enabled us to investigate longer wavelengths using five additional infrared bandpass filters from 750–950 nm, centred every 50 nm. This camera has a lower spatial resolution than the Nikon (1,534 × 1,024 pixels compared to 3,872 × 2,592 pixels) and a lower bit depth (12 bits compared to 16 bits). The lens was a Nikkor 50 mm f/2, which was set at a constant aperture of f/5.6 for all images. The sampling resolution on the surface of the parchment was approximately 12 pixels/mm (300 dpi).

A total of 2,800 images were acquired in two imaging sessions, requiring careful data management processes to name each sample and record metadata about the process (Table 2; Giacometti, 2013).

Table 2 Imaging details per session: imaging system and lighting modality, and photograph counts
Session | Modality | Samples | Captures | Number of images
Before degradation | Colour reflective | 25 | 17 | 425
Before degradation | Colour transmission | 25 | 17 | 425
Before degradation | Monochromatic reflective | 25 | 22 | 550
Before degradation | Session total | | | 1,400
After degradation | Colour reflective | 25 | 16 | 425
After degradation | Colour transmission | 25 | 16 | 425
After degradation | Monochromatic reflective | 25 | 21 | 550
After degradation | Session total | | | 1,400
Total | | | | 2,800
For each sample, one photograph was taken with each of 16 filters, plus one photograph without a filter.

Fig. 3 A diagram of the location of the samples cut from both leaves of the iron gall ink on parchment manuscript. Each sample is 8 cm square, giving an overall impression of the size of the original parchment.

Fig. 4 Imaging set up for colour reflective imaging. The camera is locked facing vertically downwards. The sample is placed on the copystand over a piece of black card under a sheet of anti-reflective glass. The process of acquiring the images involved manually exchanging each filter in front of the camera lens, causing small movements of the camera each time (which meant the resulting images required further image registration, see Giacometti, 2013).

4.4 Image Processing

The analysis of images created by our experimental approach is important in several ways, as we can use various image processing algorithms to produce estimates of the original writing from the degraded samples, and calculate how different our results are from the untreated samples (this is demonstrated in Fig. 6). There is a variety of image processing methods suitable for this task, including K-means clustering⁹, Principal Components Analysis (PCA)¹⁰, Independent Components Analysis (ICA)¹¹, and Linear Spectral Mixture Analysis (LSMA)¹², although early on in this research we demonstrated the limited success of K-means clustering for our application (Giacometti, 2013). We therefore processed the multispectral images of each treated sample using three different methods: PCA, ICA, and LSMA¹³.

First, data were cleaned and prepared: each monochrome image was cropped to a square area of 980 × 980 pixels, and the stack of images representing the same sample was co-registered so that each pixel represents the same coordinates on the physical sample. This corrected for movement in the camera as lens filters were changed, and also for some degradation treatments which caused the samples to shrink or curl¹⁴. We used a two-step registration process. An initial linear step performed gross realignment and a subsequent non-linear step provided non-rigid registration (Giacometti, 2013). Both steps were implemented using software called NiftyReg (Modat et al., 2010). This resulted in a series of images of the treated sample which could be directly compared pixel-by-pixel to images of the untreated sample.
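
As an illustration of how a co-registered stack can be decomposed into component images, the following minimal sketch applies PCA with NumPy. It is not the code used in this study: the function name pca_layers, its parameters, and the assumption that the stack is held as an array of shape (bands, height, width) are all hypothetical choices made for the example.

```python
import numpy as np


def pca_layers(stack, n_components=4):
    """Decompose a co-registered image stack into principal component images.

    stack: array of shape (bands, height, width), one image per waveband
           (assumed layout, chosen for this sketch).
    Returns (layers, explained), where layers has shape
    (n_components, height, width) and explained holds the fraction of the
    total variance carried by each returned component.
    """
    bands, height, width = stack.shape

    # Treat every pixel as one observation with one value per waveband.
    pixels = stack.reshape(bands, -1).T.astype(np.float64)

    # Centre each waveband so the components describe variation, not brightness.
    pixels -= pixels.mean(axis=0)

    # Rows of vt are the principal axes in waveband space, ordered by variance.
    _, singular_values, vt = np.linalg.svd(pixels, full_matrices=False)

    # Project the pixels onto the leading axes and fold back into images.
    scores = pixels @ vt[:n_components].T
    layers = scores.T.reshape(n_components, height, width)

    explained = singular_values**2 / np.sum(singular_values**2)
    return layers, explained[:n_components]
```

The sign of each component is arbitrary, so a layer that isolates ink may show it as light on dark or dark on light; this is one reason an intensity-independent similarity measure is needed when comparing layers.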
The three image analysis procedures were applied to these co-registered images. The output of these consisted of a series of individual images, each of which was intended to show one feature of the sample; for example, we might have four images or ‘layers’ representing two inks, a stain, and the underlying parchment.

A comparison of the recovered images was then performed. Our aim was to be as objective and quantitative as possible, which was made difficult because the layers obtained from PCA and ICA show degrees of correlation rather than similar intensities. For example, ink might appear as dark on a light background in the sample, but the layer representing ink might appear as a light pattern on a dark background. A straightforward similarity measure such as least squares similarity would give a poor result in this case even if the ink had been perfectly identified. We therefore used an approach known as ‘mutual information’, which is a quantitative measure of the amount of information shared between two images, independent of their colour or intensity (Wells et al., 1996; Panagiotou, 2010)¹⁵. In our case, a correctly identified text pattern gives a high mutual information score irrespective of the colour assigned to the recovered layer.

4.5 Controlled Damage

We describe three of the twenty methods of destruction in more detail below. These three examples were chosen to illustrate different classes of damage: one where the ink was physically removed (and therefore no longer remains on the document); one where the ink remains in its original form but has been obliterated by a stain; and finally one where the ink remains but has been chemically altered by a bleaching agent.

4.5.1 Scraping

Forcibly removing the surface layer and the visible writing from the parchment with pumice stone¹⁶ was a commonly employed method to re-use parchment (Diringer, 1953; Netz and Noel, 2007), and therefore is a common problem for which multispectral imaging may help in recovering the erased text. We divided our sample into three areas, leaving the top area untreated, the middle area scraped gently, and the bottom area scraped thoroughly for a longer period. The scraping was undertaken by beating a pumice stone until it became powder, and, using a piece of cotton wool, scraping the parchment with this powder, in circular motions.

4.5.2 Blood

Bloodstains are not uncommon in historical documents (Wechsler, 1952), and historical texts can even be written in blood (Gurkina and Rebrikova, 2001; Kieschnick, 2001). Moreover, blood looks similar to iron gall ink, as both haemoglobin and iron gall are rich in the same colour-carrying iron oxides, which contributes to the difficulty of separating ink from bloodstains. In criminology, multispectral imaging has been found to assist in dating blood stains at crime scenes (Edelman et al., 2012). We stained our parchment with human blood (obtained as expired blood from the UCLH Blood Transfusion Unit), until it fully penetrated the parchment. The excess blood was removed with blotting paper.

4.5.3 Sodium hypochlorite

Sodium hypochlorite is a strong alkaline substance and a common bleaching agent. Reports of experiments with both paper and parchment describe two bleaching effects: the parchment turning pink and the iron gall ink fading (Smith, 2012). There are also anecdotes of unscrupulous bleaching of parchments by curators and conservators in an attempt to read them (Blagden, 1787; Fuchs, 2003).
In our method, 10–15% sodium hypochlorite (NaOCl) was diluted in 40 mL of de-ionised water (pH 5.5) until a pH of 13.0 was reached. This solution was then applied to the parchment.

5 Results

We describe the results of imaging our three methods of destruction in more detail below. We concentrate here on results obtained from monochromatic reflective imaging, as the analysis of these images has produced useful outputs. See Giacometti (2013) for further research on our other imaging modalities.

5.1 Example 1: Scraping

Figure 5 shows the initial sample, a photograph of the sample after scraping, and the best recovered image. The scraping has rendered the affected areas illegible to the naked eye (centre of Fig. 5). There are traces of text remaining on the middle third of the sample, where the scraping was lighter, but the bottom third has very few marks where the writing used to be. On the top third, where a single word was removed, there is a darker mark on the parchment, but no traces of the word can be seen. The degradation to the integrity of the parchment is also visible in the samples: the parchment has become physically thinner in the scraped areas.

Figure 6 shows the recovered component images from the sample shown in Fig. 5. The first row of Fig. 6 shows the four largest principal components of the images of the untreated samples. It can be seen that the pattern of the text is recovered successfully, but the intensity is not faithfully recovered: the text is white on black, whereas in the original sample (see Fig. 5) it is black on white. This illustrates the effect described in section 4.4 and the necessity of a technique like mutual information, which is sensitive to the overall patterns but not to the absolute values, for comparing the images quantitatively. Each subsequent row displays four registered recovery images resulting from one of the three image processing algorithms (from rows 2–4, PCA, ICA, and LSMA, respectively).
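The contrast between least squares similarity and mutual information can be illustrated with a minimal sketch. It is not the implementation used in this study: the histogram-based estimator, the bin count of 32, the function name `mutual_information`, and the synthetic ‘page’ are all assumptions made purely for illustration. The sketch compares a dark-on-light image with its light-on-dark inversion and with an unrelated noise image.

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """Histogram-based estimate of mutual information (in bits) between two images.

    Illustrative estimator only; the bin count is an arbitrary assumption.
    """
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()                # joint probability table
    px = pxy.sum(axis=1, keepdims=True)      # marginal of image a, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)      # marginal of image b, shape (1, bins)
    nz = pxy > 0                             # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(0)

# Synthetic "page" (hypothetical data): dark strokes on light parchment plus noise.
page = np.full((64, 64), 0.9)
page[16:48:8, 8:56] = 0.2
page += rng.normal(0.0, 0.02, page.shape)

inverted = 1.0 - page                # same pattern, light strokes on dark background
unrelated = rng.random(page.shape)   # uniform noise sharing no pattern with the page

# Least squares (mean squared difference): lower means "more similar".
mse_inverted = float(np.mean((page - inverted) ** 2))
mse_unrelated = float(np.mean((page - unrelated) ** 2))

# Mutual information: higher means "more shared information".
mi_inverted = mutual_information(page, inverted)
mi_unrelated = mutual_information(page, unrelated)

print(f"MSE  inverted={mse_inverted:.3f}  unrelated={mse_unrelated:.3f}")
print(f"MI   inverted={mi_inverted:.3f}  unrelated={mi_unrelated:.3f}")
```

Under these assumptions, the least squares measure rates the inverted layer as less similar to the page than pure noise, whereas mutual information rates it far higher than the noise image, which is the behaviour needed when a recovered layer shows the text with reversed contrast.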
The mutual information between images of the damaged and undamaged samples ranged from zero, where no information was shared between a pair of images, to 0.247, the highest quantitative similarity, which occurred between NO-PC1 and SC-ICA3 in Fig. 6¹⁷.

In this sample, then, the treatment has significantly affected the writing, but there is information that can be recovered using multispectral imaging: we can visualise some recovery of the writing, and the quantitative measure indicates that ICA achieved the best results.

5.2 Example 2: Blood

On our sample section shown in Fig. 7, the blood treatment has obscured the writing on the bottom half of the sample. The bloodstains are of a dark brownish colour. The lack of contrast between the stained parchment and the ink renders the writing almost illegible. The treatment has also changed the geometry of the sample, shrinking it slightly. This effect has been corrected in the recovery image by co-registering it with the image of the sample before the treatment shown in Fig. 8.

In this case, the affected area of the sample that holds ink is stained, and any interpretation of the sample is difficult on the unprocessed images. The highest mutual information measure was 0.254 and occurred between the unprocessed image and BL-PC2. The mutual information between the unprocessed image and BL-LS3 was almost equally high (0.240).

Our systematic data gathering enables further comparison. We can look at the intensity of each pixel as a function of wavelength (Fig. 9). The top part of Fig. 9 shows the untreated parchment. The band of pixels with high reflectance at all wavelengths corresponds to the background, or the parchment substrate.
A smaller, less clustered group which corresponds to the writing is darker at shorter wavelengths, but has similar reflectance to the parchment towards the infrared. However, after the treatment is applied (bottom of Fig. 9), a second band of pixels corresponding to the blood appears and follows a similar path to that of the ink, with the response across the spectrum appearing similar. This demonstrates that the wavelength dependence of blood is similar to that of much of the ink and indicates why the blood makes the ink difficult to read (see also the spectral reconstruction in MacDonald et al., 2013).

In this case, the recovery estimate shows a clear trace of the writing with identifiable letters, even though the image of the treated sample looks, to the human eye, too obscured to be legible. Figure 9 shows that the spectrum of blood is similar to that of ink, so even multispectral imaging might not be completely successful. However, we have shown that PCA is capable of enhancing text, and that it is the first image processing method that should be tried when faced with such damage. LSMA also performed well and should be considered if good knowledge of the spectra of the blood and ink is available.

5.3 Example 3: Sodium hypochlorite

The writing on the sample treated with sodium hypochlorite has become very faint (see the bottom part of the central image in Fig. 10). Both the parchment and the writing have become lighter. The areas where the writing has remained visible appear to be where there are stronger ink marks, suggesting that the ink penetrated deeper into the structure of the parchment. During the application of the treatment, the parchment started to lose structural integrity and some small particles separated from the sample.
The treatment was stopped earlier than planned because of this, as the sample needed to be preserved intact for imaging purposes. After drying, the sample became stable again. However, it remained in a fragile state, lighter to the touch, more transparent, smoother, and more flexible than before treatment.

In this case (Fig. 11), the maximum mutual information was 0.294 and occurred for SH-PC1. The second highest occurred for SH-IC4 (0.237). Even the best recovered estimate (right image in Fig. 10) appears to be very similar to the image of the treated sample (middle image of Fig. 10), and our methods have not been able to extract any further information from the text. Multispectral imaging cannot recover text in every example of damage, and our systematic investigations indicate where it is worthwhile using multispectral imaging, and where it simply will not recover any additional information from damaged and deteriorated texts. Our research indicates that text damaged by bleaching agents such as sodium hypochlorite might not generate useful results when imaged multispectrally: this could inform cost analysis on when it may, or may not, be worth imaging particular texts which show indications of this sort of damage.

Further results and discussion regarding all of our degradation techniques, and the results from both capture and processing, are catalogued in Giacometti (2013), providing a framework in which to understand the successes and limitations of multispectral imaging and the image processing algorithms used.

Fig. 5 Sample O609R (left to right) before treatment, after treatment (scraping), and the best possible recovery estimate using our methods (third independent component)

Fig. 6 Sample O609R (scraping) image processing results by three different processing algorithms. The top row includes PCA of the untreated samples, the next row the PCA of the treated samples. Row three indicates the results further processed with ICA, row four shows LSMA: both highlight similarities between the original and the treated samples

6 Discussion

Our approach, acquiring multispectral images of historical parchment from a set of samples before and after they were submitted to various forms of degradation, is novel. In this work, we attempted to recover writing from multispectral images, whilst objectively and quantitatively evaluating the effectiveness of image processing algorithms. Our approach successfully identifies the samples which contain more mutual information shared with the original text, and successfully ranks partial recovery of information.

The effect of each of the twenty treatments on both the parchment and the visibility of the writing it carries varied significantly. In some samples in which the treatment rendered the writing unreadable, including those treated with aniline dye, oil, and blood, the writing can be recovered. In other samples, including those treated with iron gall ink, India ink, and mould, the writing is completely obscured or the parchment has been severely affected, and recovery is all but impossible. In most cases, however, the image processing algorithms can extract more information corresponding to the writing from the multispectral images of treated samples than the human eye can see.

PCA outperformed ICA and LSMA as the image processing means by which to produce accurate recovery estimates for almost all the samples (although one of our examples shown above, the scraped fragment, was more successfully recovered with ICA). This shows that there is not one approach or algorithm which suits all types of document degradation, and that the specific condition of a document affects the processing methods which should be used on resulting images.
However, PCA is a standard processing algorithm which appears to be accurate and robust in this application, and is therefore recommended as the first in a range of processes when analysing multispectral images. Further processing may yield improved results.

Our research depends on deliberately degrading square samples cut from a real historical iron gall ink manuscript on parchment. This degradation was necessary to model the types of damage commonly seen in historical documents, and to understand how they affect the reading and interpretation of writing, both before and after multispectral imaging of the samples. The critical destruction is therefore a core part of our method, as it is central to a complete understanding of the effectiveness of multispectral imaging on primary historical texts.

Fig. 7 Sample I208R (left to right) before treatment, after treatment (blood), and the best possible recovery estimate using our methods (second principal component)

However, our approach does not provide systematic information about any single degradation cause. There is much research to follow on from this, given that we have shown that using carefully prepared historical evidence can provide an effective framework for the evaluation and analysis of the application of multispectral imaging. Additional analysis of our data is possible, and we have already carried out further research into the estimation of spectral signatures of the materials present in the documents from the collected multispectral images (MacDonald et al., 2013). We envisage that the dataset has the potential to become an invaluable asset for libraries and archives, research in conservation, and various problems in image and signal processing, and have made all of the data generated from this project available for use by others¹⁸.
Our dataset provides physical information on how parchment reacts to various forms of degradation, and also provides documentation on acquisition, and will provide a resource for future research (reducing the need for experimentation on valuable primary historical texts). Our next step will be to carry out a similar process concentrating solely on systematically reproducing different degrees of an individual type of degradation (such as water damage or smoke and heat) to provide further information to help both future conservation and digitisation efforts.

Fig. 8 Sample I208R (blood) image processing results by processing algorithm. The top row includes PCA of the untreated samples, the next row the PCA of the treated samples. Row three indicates the results processed with ICA, row four shows LSMA: both highlight similarities between the original and the treated samples

Fig. 9 Sample I208R (blood) spectral intensity against wavelength. Above: before treatment. Below: after treatment

Fig. 10 Sample O605R (left to right) before treatment, after treatment (sodium hypochlorite), and the best possible recovery estimate using our methods (first principal component)

Fig. 11 Sample O605R (sodium hypochlorite) image processing results by processing algorithm. The top row includes PCA of the untreated samples, the next row the PCA of the treated samples. Row three indicates the results further processed with ICA, row four shows LSMA

7 Conclusion

As multispectral imaging becomes more frequently used in the cultural and heritage sectors, it is important to understand the framework which underpins its application to the capture and analysis of primary historical texts. Our research has provided a systematic methodology for the continuing study and evaluation of the techniques involved in the analysis and processing of multispectral images of degraded cultural heritage documents, and a basis for further testing and development. Understanding the most efficient way to apply these techniques to damaged and abraded texts is central to ensuring that the images created when using multispectral imaging (which become evidence to be used by a range of scholars including historians, palaeographers, and papyrologists) can be trusted by researchers, whilst also making the most efficient use of resources. Our systematic approach provides a framework for the analysis of deteriorated documents using multispectral techniques.

Carrying out this type of interdisciplinary research facilitates a deeper understanding of the artefacts, multispectral imaging, and image processing methods. Specifically, it provides a methodology for the continuing study of the techniques involved in the analysis and processing of multispectral images of degraded cultural heritage documents, and a framework for further testing and development. It has required input from conservators, digitisation specialists, medical physicists, engineers, computer scientists, and archivists, all collaborating in a Digital Humanities project where aspects of computing are advanced as much as our understanding of a process that can be useful for humanities scholars. Our unique methodology, where the destruction of a historical text is necessary to acquire experimental data for evaluation, can now be used to evaluate a process for reading other, more valuable, historical texts. Our combined critical approach to a developing technology allows us to advise and steer the application of multispectral techniques to primary historical texts.

Funding

This work was supported by the Engineering and Physical Sciences Research Council [grant number EP/F01208X/1].
We would like to thank London Metropolitan Archives for donating the parchment manuscript which allowed us to carry out this research.

References

Attas, E. M. (2004). Enhancement of document legibility using spectroscopic imaging. Archivaria, 57: 131–46.
Balas, C., Papadakis, V., Papadakis, N., Papadakis, A., Vazgiouraki, E. and Themelis, G. (2003). A novel hyper-spectral imaging apparatus for the non-destructive analysis of objects of artistic and historic value. Journal of Cultural Heritage, 4(S1): 330–7.
Barnett, T., Chalmers, A., Diaz-Andreu, M., Ellis, G., Longhurst, P., Sharpe, K. and Trinks, I. (2005). 3D laser scanning for recording and monitoring rock art erosion. International Newsletter on Rock Art, 41: 25–9.
Baumann, R., Porter, D. C. and Seales, W. B. (2008). The use of Micro-CT in the study of archaeological artifacts. 9th International Conference on NDT of Art. Jerusalem, Israel.
Blagden, C. (1787). Some observations on ancient inks, with the proposal of a new method of recovering the legibility of decayed writings: by Charles Blagden, M. D. Sec. R. S. and F. A. S. Philosophical Transactions of the Royal Society of London, 77: 451–7.
Bonanni, L., Xiao, X., Hockenberry, M., Subramani, P., Ishii, H., Seracini, M. and Schulze, J. (2009). Wetpaint: scraping through multi-layered images. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’09), New York, NY: ACM, pp. 571–4. http://doi.acm.org/10.1145/1518701.1518789. doi: 10.1145/1518701.1518789.
Chabries, D. M., Booras, S. W. and Bearman, G. H. (2003). Imaging the past: recent applications of multispectral imaging technology to deciphering manuscripts. Antiquity, 77(296): 359–72.
Chahine, C. (2000). Changes in hydrothermal stability of leather and parchment with deterioration: a DSC study. Thermochimica Acta, 365(1–2): 101–10.
Clarkson, C. (1992). Rediscovering parchment: the nature of the beast. The Paper Conservator, 16(1): 5–26.
Conway, P. (2008).
Best practices for digitizing photographs: a network analysis of influences. Proceedings of IS&T’s Archiving 2008, Imaging Science and Technology, Berne, 24–27 June.
Crowther, C., Nyhan, J., Tarte, S. and Dahl, J. (2014). New and recent developments in image analysis: theory and practice. Panel Session, Digital Humanities 2014. http://dharchive.org/paper/DH2014/Panel-759.xml
Deegan, M. and Tanner, S. (2002). Digital Futures: Strategies for the Information Age. London: Library Association Publishing.
Diringer, D. (1953). The Book Before Printing: Ancient, Medieval, and Oriental. Mineola, NY: Courier Dover Publications.
DH2014 (2014). Plenary sessions, panels, long papers, short papers, posters and workshops at digital humanities 2014. http://dharchive.org/
Dobrusina, S. A. and Visotskite, V. K. (1994). Chemical treatment effects on parchment properties in the course of ageing. Restaurator, 15(4): 208–19.
Dolgin, B., Bulatov, V. and Schechter, I. (2007). Non-destructive assessment of parchment deterioration by optical methods. Analytical and Bioanalytical Chemistry, 388(8): 1885–96.
Earl, G., Martinez, K. and Malzbender, T. (2010). Archaeological applications of polynomial texture mapping: analysis, conservation and representation. Journal of Archaeological Science, 37(8): 2040–50.
Easton, R. L., Jr., Knox, K. T. and Christens-Barry, W. A. (2003). Multispectral imaging of the Archimedes palimpsest. Proceedings of the 32nd Applied Imagery Pattern Recognition Workshop, San Jose, California, pp. 111–6.
Easton, R. L., Jr., Knox, K. T., Christens-Barry, W. A., Boydston, K., Toth, M. B., Emery, D. and Noel, W. (2010). Standardized system for multispectral imaging of palimpsests.
Proceedings of SPIE 7531, Computer Vision and Image Analysis of Art 75310D.111, San Jose, California.
Edelman, G., van Leeuwen, T. G. and Aalders, M. C. (2012). Hyperspectral imaging for the age estimation of blood stains at the crime scene. Forensic Science International, 223(1–3): 72–7.
Everdell, N. L., Styles, I. B., Claridge, E., Hebden, J. C. and Calcagni, A. S. (2009). Multispectral imaging of the ocular fundus using LED Illumination. In Depeursinge, C. and Vitkin (eds), Novel Optical Instrumentation for Biomedical Applications IV, Vol. 7371. Proceedings of SPIE-OSA Biomedical Optics. Optical Society of America, Munich, Germany.
Fuchs, R. (2003). The history of chemical reinforcement of texts in manuscript - what should we do now? In Fellows-Jensen, G. and Springborg, P. (eds), Care and Conservation of Manuscripts 7: Proceedings of the Seventh International Seminar Held at the Royal Library, Vol. 7. Copenhagen, Denmark: Museum Tusculanum Press.
Giacometti, A., Campagnolo, A., MacDonald, L., Mahony, S., Terras, M., Robson, S., Weyrich, T. and Gibson, A. (2012). Documenting Parchment Degradation via Multispectral Imaging. Proceedings of BCS Conference on Electronic Imaging and the Visual Arts (EVA), London, pp. 301–8.
Giacometti, A. (2013). Evaluating Multispectral Imaging Processing Methodologies for Analysing Cultural Heritage Documents. Ph.D. thesis, University College London, forthcoming.
Giacometti, A., Terras, M. and Gibson, A. (2015). Objectively evaluating text recovery methodologies for multispectral images of palimpsests. International Journal of Heritage in the Digital Era, 15th Issue dedicated to Computer Vision in Cultural Heritage.
Giurginca, M., Lacatusu, I. and Miu and I. Petroviciu, L. (2009). Parchment behaviour under extreme heat and fire conditions, 13(3): 337–9.
Goltz, D. M., Cloutis, E., Norman, L. and Attas, M. (2007). Enhancement of faint text using visible (420–720 nm) multispectral imaging. Restaurator, 2007: 11–28.
Goltz, D. and Hill, G.
(2012). Hyperspectral Imaging of Daguerreotypes. Restaurator: International Journal for the Preservation of Library and Archival Material, 33(1): 1–16.
Gonzalez, R. C. and Woods, R. E. (1993). Digital Image Processing. Reading, Massachusetts: Addison-Wesley Publishing.
Góra, M., Pircher, M., Götzinger, E., Bajraszewski, T., Strlic, M., Kolar, J., Hitzenberger, C. K. and Targowski, P. (2006). Optical coherence tomography for examination of parchment degradation. Laser Chemistry, 68: 1–6.
Gurkina, S. and Rebrikova, N. (2001). Treatment of parchment fragments of a Hebrew Bible. Restaurator, 22(3): 181–6.
Gray, R. and Neuhoff, D. (1998). Quantization. IEEE Transactions on Information Theory, 44(6): 2325–83.
Hardeberg, J. Y., Schmitt, F. and Brettel, H. (2002). Multispectral color image capture using a liquid crystal tunable filter. Optical Engineering, 41(10): 2532–48.
Hartigan, J. A. and Wong, M. A. (1979). Algorithm AS 136: a K-Means clustering algorithm. Journal of the Royal Statistical Society: Series C (Applied Statistics), 28(1): 100–8.
Heinz, D. and Chang, C. I. (2001). Fully constrained least squares linear spectral mixture analysis method for material quantification in hyperspectral imagery. IEEE Transactions on Geoscience and Remote Sensing, 39(3): 529–45.
Hill, D. L. G., Batchelor, P. G., Holden, M. and Hawkes, D. J. (2001). Medical image registration. Physics in Medicine and Biology, 46(3): R1–45.
Hollaus, F., Gau, M. and Sablatnig, R. (2013). Acquisition and Enhancement of Multispectral Images of Ancient Manuscripts. Berlin, Germany: Kultur und Informatik: Visual Worlds and Interactive Spaces, pp. 187–97.
Hughes, L. (2004). Digitizing Collections: Strategic Issues for the Information Manager. London: Facet Publishing.
Hyvärinen, A., Karhunen, J. and Oja, E. (2001). Independent Component Analysis. New York, NY: John Wiley and Sons.
Information in Images (2014). Multispectral document imaging. www.informationinimages.com/#!multispectral-document-scanning/c1yhe
Jolliffe, I. T. (2002). Principal Component Analysis. New York, NY: Springer-Verlag.
Joo Kim, S., Deng, F. and Brown, M. S. (2011). Visual enhancement of old documents with hyperspectral imaging. Pattern Recognition, 44(7): 1461–9.
Kennedy, C. J. and Wess, T. J. (2006). Chapter 4 The use of X-ray scattering to analyse parchment structure and degradation. In David, B. and Dudley, C. (eds), Physical Techniques in the Study of Art, Archaeology and Cultural Heritage. Elsevier, pp. 151–72.
Kieschnick, J. (2001). Blood writing in Chinese Buddhism. Journal of the International Association of Buddhist Studies, 23(2): 177–94.
Knox, K. T. (2008). Enhancement of overwritten text in the Archimedes Palimpsest. Proc. SPIE 6810, Computer Image Analysis in the Study of Art, 681004 (29 February 2008); doi: 10.1117/12.766679.
Knox, K. T., Easton, R. L., Jr., Christens-Barry, W. A. and Boydston, K. (2011). Recovery of handwritten text from the diaries and papers of David Livingstone. Proceedings of SPIE 7869, Computer Vision and Image Analysis of Art II 786909, pp. 1–7.
Landgrebe, D. (1999). Information extraction principles and methods for multispectral and hyperspectral image data. In Chen, C. (ed.), Information Processing for Remote Sensing. River Edge, NJ: World Scientific Publishing Company, pp. 3–38.
Larsen, R. (2007). Introduction to damage and damage assessment. In Larsen, R. (ed.), Improved Damage Assessment of Parchment (IDAP): Assessment, Data Collection and Sharing of Knowledge, 1st edn. European Commission, Directorate-General for Environment, pp. 17–21.
Luccheseyz, L. and Mitray, S. K. (2001). Color image segmentation: a state-of-the-art survey. Proceedings of the Indian National Science Academy (INSA-A), Delhi: Indian National Science Academy, 67(2): 207–21.
MacDonald, L. and Jacobsen, R. (2006). Assessing image quality. In MacDonald, L.
(ed.), Digital Heritage, Applying Digital Imaging to Cultural Heritage. Oxford: Butterworth-Heinemann, pp. 351–74.
MacDonald, L. W., Giacometti, A., Campagnolo, A., Robson, S., Weyrich, T., Terras, M. and Gibson, A. (2013). Multispectral imaging of degraded parchment. Computational Color Imaging, 4th International Workshop, CCIW 2013, Chiba, Japan, 3–5 March 2013, Proceedings. In Tominaga, S., Schettini, R., and Trémeau, A. (eds), Lecture Notes in Computer Science, Vol. 7786, Chiba, Japan: Springer Berlin Heidelberg.
Marengo, E., Manfredi, M., Zerbinati, O., Robotti, E., Mazzucco, E., Gosetti, F., Bearman, G., France, F. and Shor, P. (2011). Development of a technique based on multi-spectral imaging for monitoring the conservation of cultural heritage objects. Analytica Chimica Acta, 706(2): 229–37.
Meghea, A., Giurginca, M., Iftimie, N., Miu, L., Viorica, B. and Budrugeac, P. (2004). Behaviour to accelerate ageing of some natural biopolymer constituents of parchment. Molecular Crystals and Liquid Crystals, 418(1): 285–90.
Modat, M., Ridgway, G. R., Taylor, Z. A., Lehmann, M., Barnes, J., Hawkes, D. J., Fox, N. C. and Ourselin, S. (2010). Fast free-form deformation using graphics processing units. Computer Methods and Programs in Biomedicine, 98(3): 278–84.
Netz, R. and Noel, W. (2007). The Archimedes Codex: How a Medieval Prayer Book Is Revealing the True Genius of Antiquity’s Greatest Scientist. 1st edn. Da Capo Press, London.
Panagiotou, C. (2010). Information Theoretic Regularization in Diffuse Optical Tomography. Ph.D. thesis. London: University College London.
Peatross, J. and Ware, M. (2013). Physics of Light and Optics. Provo: Brigham Young University.
Ponto, K., Seracini, M. and Kuester, F. (2009). Wipe-off: an intuitive interface for exploring ultra-large multispectral data sets for cultural heritage diagnostics. Computer Graphics Forum, 28(8): 2291–301.
Rapantzikos, K. and Balas, C. (2005).
Hyperspectral imaging: potential in non-destructive analysis of palimpsests. IEEE International Conference on Image Processing, 2.
Ramsay, S. and Rockwell, G. (2012). Developing things: towards an epistemology of building in the digital humanities. In Gold, M. K. (ed.), Debates in the Digital Humanities. Minneapolis: University of Minnesota Press, pp. 75–84.
Ratto, M. (2011). Critical making: conceptual and material studies in technology and social life. The Information Society, 27(4): 252–60.
Reed, R. (1972). Ancient Skins, Parchments and Leathers. London, UK: Seminar Press.
Salerno, E., Tonazzini, A. and Bedini, L. (2007). Digital image analysis to enhance underwritten text in the Archimedes palimpsest. International Journal of Document Analysis and Recognition (IJDAR), 9(2–4): 79–97.
Schuman, R. (2014). I tweeted a joke that started a big Ass Ruckus: pan kisses Kafka. http://pankisseskafka.com/2014/01/08/i-tweeted-a-joke-that-started-a-big-ass-ruckus/ (accessed 8 January 2014).
Senvaitienė, J., Beganskienė, A., Tautkus, S., Padarauskas, A. and Kareiva, A. (2005). Characterization of historical writing inks by different analytical techniques. Chemija, 16(3–4): 34–8.
Smith, T. (2012). An evaluation of historical bleaching with chlorine dioxide gas, sodium hypochlorite, and chloramine-T at the Fogg art museum. Restaurator, 33(3–4): 249–73.
Tanner, S. and Bearman, G. (2009). Digitising the Dead Sea Scrolls: Archiving 2009. Arlington, VA: The Society for Imaging Science and Technology, pp. 119–23.
Terras, M. (2006a). Image to Interpretation: Intelligent Systems to Aid Historians in the Reading of the Vindolanda Texts. Oxford Studies in Ancient Documents, Oxford University Press, Oxford.
Terras, M. (2006b).
Disciplined: Using educational studies to analyse humanities computing. Literary and Linguistic Computing, 21(2): 229–46.
Terras, M. (2008). Digital Images for the Information Professional. London: Ashgate.
The National Archives (2015). Deaccessioning and disposal: guidance for archive services. www.nationalarchives.gov.uk/documents/Deaccessioning-and-disposal-guide.pdf
Vnouček, J. (2007). Typology of the damage of the parchment in manuscripts of the codex form. In Larsen, R. (ed.), Improved Damage Assessment of Parchment (IDAP): Assessment, Data Collection and Sharing of Knowledge, 1st edn. European Commission, Directorate-General for Environment, Luxembourg, pp. 27–30.
Wells, W. M., III, Viola, P., Atsumi, H., Nakajima, S. and Kikinis, R. (1996). Multimodal volume registration by maximization of mutual information. Medical Image Analysis, 1(1): 35–51.
Weingart, S. (2013a). Acceptances to digital humanities 2013 (part 1). www.scottbot.net/HIAL/?p=35242 (accessed 25 April 2013).
Weingart, S. (2013b). Submissions to digital humanities 2014. www.scottbot.net/HIAL/?p=39588 (accessed 5 November 2013).
Weingart, S. (2014). Acceptances to digital humanities 2014 (part 1). www.scottbot.net/HIAL/?p=40695 (accessed 10 April 2014).
Wechsler, T. (1952). The origin of the so called dead sea scrolls. The Jewish Quarterly Review, 43(2): 121–39.
Workman, J. and Weyer, L. (2007). Practical Guide to Interpretive Near-Infrared Spectroscopy. CRC Press, London.
Zolfagharifard, E. (2014). Does the Bible have secrets to reveal? Scholars hope to restore hidden text in ancient New Testament manuscript. http://www.dailymail.co.uk/sciencetech/article-2752384/Scholars-hope-restore-hidden-text-ancient-New-Testament-manuscript.html#ixzz3LgzpVv7O

Notes

1 An analysis of submissions to the Digital Humanities 2014 conference carried out by Scott Weingart demonstrates that work on text processing remains the core focus of the Digital Humanities community (Weingart, 2013b), with an analysis of DH2014 acceptances indicating that ‘Literary studies, text analysis, and text mining still reign supreme’ (Weingart, 2014). This follows the same trends identified in Weingart’s analysis of Digital Humanities 2013 acceptances (Weingart, 2013a). An earlier analysis of the most used words in the ACH/ALLC Conference abstracts 1996–2005 (Terras, 2006b, p. 236) indicates that text was the focus of this earlier work of the Digital Humanities community.
2 At the Digital Humanities conference 2014, one of the eight panels was devoted to image processing (Crowther et al., 2014), and the program also contained a range of short and long papers dealing with image processing, optical character recognition, and the search, retrieval, and navigation of high resolution document image collections (DH2014, 2014).
3 Other emerging techniques of interest to those aiming to recover information from primary historical sources include infrared or near-infrared imaging (Workman and Weyer, 2007), and three-dimensional imaging such as micro-CT (Baumann et al., 2008), 3D laser scanning (Barnett et al., 2005), and Reflectance Transformation Imaging (RTI) (Earl et al., 2010).
4 These are the most popular ways to capture multispectral images in the heritage sector, although the cost of equipment can still prevent many institutions from undertaking this sort of analysis. At the time of writing, a set of narrowband multispectral filters retails in the region of £10,000 (and requires additional camera and lighting equipment before it can be used; this is the system we use in this experiment). A full system for production and capture of specific light wavelengths currently retails for £80,000. Camera sensors that can select wavelengths automatically have been developed (Balas et al., 2003), but these are not commercially available. Relatively low-cost scanners have been developed that claim full multispectral capabilities, currently retailing for £2,000 (Information in Images, 2014), but these claims have not been verified by independent tests.
5 Although there are now over forty different guidelines in existence which detail best practice in straightforward digitisation of cultural and heritage materials (Conway, 2008), none of them has described ideal approaches for the capture, analysis, and storage of multispectral images of heritage material.
6 Ratto, 2011; Ramsay and Rockwell, 2012; Schuman, 2014.
7 The creation of virtual models, or ‘phantoms’, to allow this comparison is also explored in detail in Giacometti (2013) and Giacometti et al. (2015).
8 www.lma.gov.uk
9 K-means clustering is a method to separate data points into a number (k) of clusters according to underlying shared characteristics. For example, an image showing parchment and two different inks might be separated into k = 3 clusters, so that the pixels representing the three ‘layers’ are identified separately. See Hartigan and Wong (1979), Gray and Neuhoff (1998), and Lucchese and Mitra (2001).
10 PCA is a technique for decomposing a set of data into its intrinsic variability, preserving the maximum variability of the data in fewer dimensions (Jolliffe, 2002). In the ideal case, each of the principal components would show one layer from the image.
11 ICA is designed to separate sources of signals from a series of measurements (Hyvärinen et al., 2001). Independent components are not ranked, and the energy of each dimension is neither preserved nor meaningful. It behaves similarly to PCA but can give different results. Again, the aim is for each of the independent components to show a different layer from the image.
12 LSMA decomposes multispectral image data into layers of materials by using a priori knowledge of the spectral signals of materials that are present (Heinz and Chang, 2001). This requires knowledge of the absorption spectrum of each dye, which might not always be available.
13 Further descriptions of these techniques and applications are available in Chapter 2 of Giacometti (2013).
14 Non-linear or non-rigid transformations are those that affect one area of an image in a different way to other areas (Hill et al., 2001), thus allowing compensation when, for example, one side of the parchment has shrunk.
15 The amount of information in an image can be formally calculated as the entropy of the image. A blank image has no information and has entropy = 0, whereas a completely random image carries maximum information (in that the value of one pixel cannot be predicted by that of its neighbours) and therefore has maximum entropy. The combined information of two images can be given as the joint entropy, which increases as the two images differ, because if the images are different, one cannot be used to predict the other. If the entropy H of an image X is H(X) and the joint entropy of images X and Y is H(X,Y), then we can define the mutual information I(X,Y) as the information shared between two images, or equivalently their similarity. Then, formally, I(X,Y) = H(X) + H(Y) − H(X,Y).
16 The procedure sometimes involved softening the parchment using a mixture of cheese, milk, and lime, before proceeding to scrape the writing using a knife or razor (Diringer, 1953).
17 Full data are available in Appendix C of Giacometti (2013).
18 The DOI for this dataset is 10.14324/000.ds.1469099.
19 The figures included in this paper were originally published in Giacometti (2013).
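
The definition of mutual information given in Note 15 can be checked numerically. What follows is a minimal sketch and not the implementation used in the study: it assumes each image has been flattened to a sequence of discrete pixel values, it estimates probabilities from simple value counts, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a sequence of hashable values."""
    counts = Counter(values)
    total = len(values)
    # Sum p * log2(1/p) over every distinct value that occurs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def mutual_information(x, y):
    """I(X,Y) = H(X) + H(Y) - H(X,Y) for two equally sized pixel sequences."""
    if len(x) != len(y):
        raise ValueError("images must have the same number of pixels")
    # Pairs of co-located pixel values give the joint distribution.
    joint = list(zip(x, y))
    return entropy(x) + entropy(y) - entropy(joint)


# Toy flattened "images" (hypothetical data, not taken from the article's dataset).
blank = [0] * 8
image = [0, 0, 1, 1, 2, 2, 3, 3]
unrelated = [0, 1, 0, 1, 0, 1, 0, 1]

print(entropy(blank))                        # blank image: no information
print(mutual_information(image, image))      # identical images: equals H(image)
print(mutual_information(image, unrelated))  # values of one do not predict the other
```

In this toy case the blank image has zero entropy, an image compared with itself shares all of its own entropy, and the two unrelated images share none, which matches the behaviour described in Note 15. Real multispectral data would normally need its pixel values binned into a histogram before the counts are taken.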