

The value of critical destruction:
Evaluating multispectral image
processing methods for the
analysis of primary historical texts

Alejandro Giacometti
Department of Medical Physics and Biomedical Engineering, UCL Centre for Digital Humanities, University College London, London

Alberto Campagnolo
Ligatus Research Centre, CCW Graduate School, University of the Arts London, London

Lindsay MacDonald
Photogrammetry, 3D Imaging and Metrology Research Centre, University College London, London

Simon Mahony
UCL Centre for Digital Humanities, Department of Information Studies, University College London, London

Stuart Robson
Photogrammetry, 3D Imaging and Metrology Research Centre, University College London, London

Tim Weyrich
Department of Computer Science, UCL Centre for Digital Humanities, University College London, London

Melissa Terras
Department of Information Studies, UCL Centre for Digital Humanities, University College London, London

Adam Gibson
Department of Medical Physics and Biomedical Engineering, University College London, London

Correspondence: Melissa Terras, Department of Information Studies, Foster Court,
University College London, Gower Street, WC1E 6BT, London.
E-mail: m.terras@ucl.ac.uk

Digital Scholarship in the Humanities © The Author 2015. Published by Oxford University Press on behalf of EADH.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License
(http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any
medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

doi:10.1093/llc/fqv036

Digital Scholarship in the Humanities Advance Access published October 7, 2015

Abstract
Multispectral imaging, a method for acquiring image data over a series of wavelengths
across the light spectrum, is becoming a valuable tool within the cultural
and heritage sector for the recovery and enhancement of information contained
within primary historical texts. However, most applications of this technique, to
date, have been bespoke: analysing particular documents of historic importance.
There has been little prior work done on evaluating this technique in a structured
fashion, to provide recommendations on how best to capture and process images
when working with damaged and abraded textual material. This article intro-
duces a new approach for evaluating the efficacy of image processing algorithms
in recovering information from multispectral images of deteriorated primary
historical texts. We present a series of experiments that deliberately degrade
samples cut from a real historical document to provide a set of images acquired
before and after damage. These images then allow us to compare, both objectively
and quantitatively, the effectiveness of multispectral imaging and image process-
ing for recovering information from damaged text. We develop a methodological
framework for the continuing study of the techniques involved in the analysis
and processing of multispectral images of primary historical texts, and a dataset
which will be of use to others interested in advanced digitisation techniques
within the cultural heritage sector.


1 Introduction

Multispectral imaging is an advanced digitisation
method for acquiring image data over a series of
wavelengths across the light spectrum. Combined
with image processing, it has become a valuable
tool for the enhancement and recovery of informa-
tion contained within culturally important docu-
ments, providing a means, in some cases, to
recover lost text, or examine other features no
longer detectable by the human eye. However,
applications of multispectral imaging within the
cultural and heritage sector have mainly been
bespoke, with limited access to or understanding
of the techniques and methods used to recover
damaged text. The barriers to accessing this technol-
ogy will become lower as the equipment becomes
commercially available; however, it is important
that we better understand the methods and
approaches used for multispectral imaging in
order to be able to use such techniques efficiently,
whilst maximising the information we can recover
from cultural objects.

This article describes a highly interdisciplinary
approach to evaluating multispectral imaging and
image processing in the context of primary histor-
ical sources. We introduce a formal methodology to
evaluate image processing of multispectral data and
provide a framework for developing new, best prac-
tice methods when using multispectral processes to
image damaged texts. We do so by first building up
a large dataset of multispectral images of actual
parchment, taken before and after a set of degrad-
ation procedures that were designed to match the
most likely types of damage which may occur over
the lifetime of parchment documents. This dataset
then allows us to evaluate the efficacy of image pro-
cessing algorithms attempting to recover damaged
text, and to make recommendations on how best to
apply multispectral imaging when attempting to
recover information from damaged text. Our
novel approach, which requires the necessary, con-
trolled destruction of a historical parchment docu-
ment, presents a formal methodology in acquiring,
processing, and analysing multispectral data. It also
led to the creation of a large dataset consisting of a
series of multispectral images showing both the
initial and degraded state of samples from a real
manuscript, providing a valuable tool for the
advanced digitisation research community. As
such, this article makes a major contribution to
our understanding of how multispectral imaging
can be used across the cultural and heritage sector,
and demonstrates how an interdisciplinary
approach centred on questions raised from within
a Digital Humanities project can advance our
understanding of image processing for both the
cultural heritage and engineering science sectors.

2 The Digital Humanities and
Imaging

Although most effort in the Digital Humanities is
focussed on the production, analysis, and visualisation
of text1, there is recent and growing interest
in the community in digital imaging, and how
image capture and processing techniques can aid us
in uncovering new bodies of information, particu-
larly from historical documents2. Digital imaging
technology has been used to produce detailed and
trustworthy surrogates of historical documents for
decades (Deegan and Tanner, 2002; Hughes, 2004;
Terras, 2008), and digitized versions of primary his-
torical sources are often adequate for the needs of
most scholars. However, improvements in image
processing and analysis have led to a number of
exciting and important digital humanities projects
which can reveal a greater wealth of information
about the originals, beyond traditional digitisation
technologies. Leveraged by technological improve-
ments in image acquisition and image processing,
humanities scholars have been able to image, ana-
lyse, and recover more information from historical
texts (Chabries et al., 2003; Terras, 2006a; Salerno et
al., 2007; Tanner and Bearman, 2009). One of the
most promising techniques3 is multispectral ima-
ging, which can provide additional evidence of the
content of a document when it is difficult to read
with the naked eye, when further information about
the physical composition of a document and ink
identification is required (Senvaitenë et al., 2005),
or when information is required about its provenance
(Tanner and Bearman, 2009).

3 Multispectral Imaging

Light is an electromagnetic wave, often characterised
by its wavelength (which we perceive as colour),
which is the distance between two consecutive
peaks of the wave. The spectrum that is visible to
humans includes wavelengths from approximately
380 nm to 760 nm (Fig. 1). Light with a wavelength
longer than 760 nm is referred to as infrared; ultra-
violet is light with wavelengths shorter than 380 nm
(Peatross and Ware, 2013). Most digital imaging
equipment captures the same broad spectrum of light
that is visible to humans with a combination of
broadband red, green, and blue sensors (this is
hardly surprising, given that the outputs of most
imaging technologies are those which humans
should be able to see). In contrast, multispectral ima-
ging measures a series of discrete wavelengths over a
defined range. These images can be acquired in the
visible spectrum and also in the infrared and ultra-
violet spectrum (with images that include a broader
range of wavelengths often being referred to as hyper-
spectral (Landgrebe, 1999)).

Multispectral images are reasonably straightfor-
ward to acquire if appropriate light sources and
detectors are available along with a method for
wavelength selection. A spectrum of light is usually
selected through the use of filters, or via a light
source4. For example, a series of filters placed in
front of a camera lens can allow images to be cap-
tured in distinct wavebands (Hardeberg et al., 2002;
Attas, 2004; Rapantzikos and Balas, 2005) or, in a
more recent development, light sources can be used
which emit at specific wavelengths (Easton et al.,
2010; Marengo et al., 2011; Hollaus et al., 2013).
Images may then be acquired using a commercial
camera or a more sophisticated scanning system.
The resulting sets of images can show different
aspects of a document at different wavelengths
(Fig. 2).
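
As a rough illustration of the kind of data such a capture produces, the sketch below treats a multispectral acquisition as a stack of greyscale images, one per waveband, and measures how the contrast between ink and parchment changes from band to band. It is a toy model rather than the software used in this study: the wavelength list, the reflectance values, and the helper names `make_synthetic_cube` and `band_contrast` are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical centre wavelengths in nm; chosen only for illustration.
wavelengths_nm = np.arange(400, 701, 20)

def make_synthetic_cube(height=32, width=32):
    """Build a toy cube of shape (bands, height, width) with a dark 'ink' stroke."""
    ink_mask = np.zeros((height, width), dtype=bool)
    ink_mask[12:20, 4:28] = True
    # Assumed behaviour: the parchment is uniformly bright, while the ink
    # reflects more and more light as the wavelength increases.
    ink_reflectance = np.linspace(0.2, 0.7, wavelengths_nm.size)
    cube = np.full((wavelengths_nm.size, height, width), 0.8)
    cube[:, ink_mask] = ink_reflectance[:, None]
    return cube, ink_mask

def band_contrast(cube, ink_mask):
    """Mean parchment intensity minus mean ink intensity, for each band."""
    parchment = cube[:, ~ink_mask].mean(axis=1)
    ink = cube[:, ink_mask].mean(axis=1)
    return parchment - ink

cube, ink_mask = make_synthetic_cube()
contrast = band_contrast(cube, ink_mask)
best_band = wavelengths_nm[np.argmax(contrast)]
print(f"Highest ink contrast at {best_band} nm")
```

Storing the bands along the first axis keeps each waveband available as an ordinary two-dimensional image, while the spectrum of any single pixel can be read with `cube[:, row, col]`.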

Multispectral imaging was first developed by
NASA in the 1950s to determine the composition
of objects in space (Landgrebe, 1999) and more
recently has been used in medical imaging, for
example in imaging the interior of the eye
(Everdell et al., 2009). It is also highly useful
for reading documents: given that different
inks have different spectral signatures due to their
differing chemical composition, multispectral ima-
ging can be used to differentiate inks used in differ-
ent areas on a document, different depths within a
document (such as in the case of palimpsests) or to
differentiate ink from other types of document
damage, such as mould or abrasion. In the cultural
and heritage sectors, multispectral imaging has been
used across a range of documents including the
Archimedes Palimpsest (Salerno et al., 2007), the
Dead Sea Scrolls (Chabries et al., 2003; Tanner
and Bearman, 2009), carbonised scrolls from
Herculaneum (Chabries et al., 2003), letters from
the Hudson Bay Archives (Goltz et al., 2007), and
palimpsests from the Saint Catherine Monastery in
Egypt (Easton et al., 2010). It has also been used for
improving tarnished daguerreotypes (Goltz and Hill,
2012), removing the effects of ink-bleeding,
ink-corrosion, and foxing (Joo Kim et al., 2011), and
recovering the diaries of David Livingstone (Knox et
al., 2011). Although
multispectral imaging is currently the leading tech-
nique for recovering lost text in historical manu-
scripts, there are no guidelines which determine
when it is the most appropriate technique compared
to imaging at a single wavelength, or what the best
wavelengths to use are. One purpose of this work is
to establish a means to compare different imaging
approaches objectively so that such guidelines can
be evidence-based.

Most of the reported applications of this tech-
nique in the cultural and heritage sector are to spe-
cific documents of great historical importance.
Wider use of the technology is now inevitable as
more examples of successful recovery from
multispectral images of historical documents arise,
although a careful cost–benefit analysis is required
to consider the type of data that a multispectral
imaging project might yield, given the present
(but falling) costs of undertaking this kind of ima-
ging. As the availability of the technique is expected
to increase, it is important to consider best practice5
in capturing and processing multispectral images.
Questions remain as to how best to take advantage
of digital visualisation technology to present multi-
spectral image data of cultural heritage to historians
and palaeographers (Bonanni et al., 2009; Ponto et
al., 2009). In addition, there is little evidence avail-
able about how best to process or analyse multispec-
tral images (Giacometti, 2013). Further processing
(the computational manipulation of digital images,
see Gonzalez and Woods (1993) for an introduc-
tion) of multispectral images can allow important
historical features and details to be identified,
enhanced, and separated from other features, and
it is important to understand what image processing
approaches are most useful when dealing specifically
with multispectral images of particular types of
damage found on primary historical texts.
Additionally, multispectral imaging can be misun-
derstood, with the technology sometimes being
described as if it were magic (for example, see
Zolfagharifard, 2014), and there is a need for a
systematic investigation into the effectiveness and
usefulness of the technique for the cultural and
heritage sector.

Fig. 1 Multispectral images are captured in a similar process to colour images, but with many images captured at
discrete narrow ranges of the light spectrum, rather than a small number of images which are each sensitive to light at a
broader range of wavelengths19

Previous multispectral imaging capture projects,
applied to specific examples of texts of historical
importance, have concentrated on recording docu-
ments in their current state (generally once import-
ant features are illegible). Here, we investigate best
practice in the multispectral imaging of heritage
material by imaging a parchment document before
and after a series of degradation processes, allowing
us to assess the effectiveness of image processing
algorithms to recover information from degraded
documents. This gives us a unique platform for
evaluating the quality of recovered images, and
allows us to assess the performance of image pro-
cessing algorithms for analysis of these images. We
propose a method for objectively comparing images
of degraded documents, and develop a method for
indicating which image processing methods are
most appropriate for recovering text which has
suffered from specific types of damage. In addition,
at a time when ‘critical making’6 is being much dis-
cussed in the Digital Humanities, we propose that
our approach to ‘critical destruction’ demonstrates
the importance of adopting quantitative approaches
when undertaking Digital Humanities research.

4 Method

The evaluation of image quality and the performance
of methods which produce images of cultural
heritage documents is a complex and challenging
task (MacDonald and Jacobsen, 2006), and there
has been little attempt to evaluate multispectral
image quality previously. Partly this is because
‘quality’ is ill-defined. Here, we are able to introduce
a new, objective definition of ‘quality’, namely, the
amount of shared information between an image of
the undamaged parchment and one of the recovered
text. This is explored more fully in section 4.4.
Existing multispectral image data are often particu-
lar to an individual document, and the success or
failure of analysis is determined by the subjectively
perceived legibility of the writing (Easton et al., 2003;
Attas, 2004; Knox, 2008). In order to assess multi-
spectral image processing methods objectively, it is
necessary to acquire data under controlled condi-
tions: capturing multispectral images of a manu-
script before any degradation occurs, and then
capturing images of the manuscript after degrad-
ation. These two sets of images enable evaluation
of the image processing methods and their perform-
ance on a real degraded document7. Naturally, this
is impossible to do with historical text which has
already been degraded (and no curator would allow
us to degrade a primary historical text of any
importance), but it is possible to adopt an experi-
mental approach in which a real manuscript is
deliberately degraded in a rigorously controlled
fashion, and its corresponding deterioration docu-
mented via multispectral imaging. This allows us to
quantitatively and objectively compare the recovery
of text from the degraded manuscript and to
evaluate the success of multispectral imaging and its
related processing approaches when faced with
specific documentary damage.

Fig. 2 Multispectral detail of a single feature from our sample 0602R captured using a monochrome camera. Note the
variation in intensity and contrast of the writing and ink across the imaged wavelengths. It can be observed how,
initially, the ink gains contrast slightly, with a darker background in the shorter wavelengths. Around the longer
wavelengths in the visible spectrum and into the near-infrared, the contrast suddenly drops quite significantly,
rendering the contrast in the 900 nm image almost null

Such an experimental approach is common in
the field of conservation: the changes that parch-
ment (or more specifically, its collagen fibres)
undergo have been studied when subjecting samples
to heat (Chahine, 2000), ultraviolet light (Meghea et
al., 2004), even open flame (Giurginca et al., 2009),
and chemical solutions (Dobrusina and Visotskite,
1994). Other methods, such as optical coherence
tomography (Góra et al., 2006), x-ray fluorescence,
optical fluorescence (Dolgin et al., 2007), and x-ray
diffraction (Kennedy and Wess, 2006), have been
used to identify the state of degradation of parch-
ment: however, ours is the first known study to
focus on the macroscopic effects of degradation
agents on the legibility of primary historical texts
in order to understand how we can most effectively
use multispectral methods.

4.1 Degraded and Degrading Text
We chose to focus on parchment documents for our
study, given that parchment remains the primary
medium of large quantities of culturally important
documents in archives, museums, libraries, and pri-
vate collections. A durable, stiff material, parchment
is made of animal skin and consists of structured
collagen fibres, but is highly sensitive to changes in
humidity, and is endangered by biological, thermo-
chemical, and mechanical damage (Reed, 1972;
Clarkson, 1992; Larsen, 2007). Through consult-
ation with conservators and archivists, we identified
twenty methods of degradation that commonly
affect historical parchment material, changing its
physical characteristics at both microscopic and
macroscopic levels (Vnouček, 2007). These included
both physical and chemical agents to mimic the
kinds of damage that parchment documents can
be expected to incur during their lives, from techno-
logical mistakes during production, to improper
use, unsuitable storage conditions, disasters, and
natural ageing (Table 1). The damage agents were
selected so as to affect not only the properties of the
parchment, but also the legibility of text in various
ways, for example, shrinking or otherwise deforming
the parchment, and obscuring or effacing the
writing via physical, chemical, or biological reac-
tions and stains (Giacometti et al., 2012).

4.2 The Parchment Prepared
An eighteenth-century manuscript was donated for
our experiment from London Metropolitan
Archives8. The document was an assignment of
property which had been deemed to hold no histor-
ical or scholarly value, and had been de-accessioned
from their collection prior to our request for parch-
ment material in accordance with The National
Archives guidance on deaccessioning and disposal
(The National Archives, 2015). The manuscript
was written in iron gall ink on prepared animal
skin, measured approximately 70×70 cm, and was
composed of two large leaves (Fig. 3), which were
folded in thirds both horizontally and vertically.
Both leaves had red margin guidelines and a blue
seal glued outside the left margin. The outer leaf
contained a large stamp on the top left corner and
a fold trapping the inner leaf with red wax. Both
leaves had writing on the recto covering most of
the area of the leaf enclosed by the red margin.
The outer leaf had a written section on its verso,
detailing the date and contents of the document.
The recto of the manuscript corresponded with
the flesh side of the parchment, the verso with the
hair side. Apart from some signs of wear and tear,
especially around the folds of the text, it was in
overall good condition. Twenty-three 8×8 cm flat
square sections were cut from the parchment as
samples for this research: each sample contains writ-
ten text.

Each of the 23 samples was selected from a flat
area of the manuscript, where the writing covered
the surface of the recto (flesh side), and the verso
was empty and without blemishes. Folds and marks
of any kind were avoided. There were three excep-
tions: two of the samples were cut so as to contain
writing on both the recto and verso, and one sample
was cut from a folded area. The old fold sample and
one of the samples with writing on both sides were
kept as controls. The samples were imaged, then a
series of treatments was applied to damage twenty
of the samples, as described in Table 1; three were
left untreated and kept intact as controls (giving a
total of 23 samples). The samples were then
reimaged, producing image sets acquired before and
after damage.

4.3 The Parchment Imaged
Multispectral images of every sample were acquired
in two separate sessions in order to capture the
samples in an untreated and a treated state. The
imaging station and equipment were not moved
between sessions. Each sample was imaged under
three modalities which represent different ways of
illuminating and capturing the images (MacDonald
et al., 2013). The Nikon camera is a consumer
camera which has a high pixel resolution and
acquires colour images; the monochrome camera
is a scientific camera which acquires greyscale
images only (making it more convenient for
multispectral imaging through a filter) and has increased
sensitivity to the infrared, but has a lower pixel

Table 1 Summary of different types of degradation (in small capitals), giving the reason for the degradation, the
circumstances in which it might occur, and the type of degradation

Degradation

reason

Circumstances Mechanical

damage

Chemical, biological,

or environmental

damage

Damage by

extraneous

substances

Technological

mistakes

During manufacture Lime solution,

acidity, finishing

Scraping Hydrochloric Acid,

Calcium Hydroxide

Oil

During writing Ink acidity Sulphuric Acid

During binding Unsympathetic

binding

Mechanical

Damage

Storage Environmental changes Temperature, humidity Heat, Desiccant, Mould

Exposure to light Visible light, UV UV Light, Controls

Pollutants, dirt Chemical reactions Smoke, Sulphuric Acid,

Controls

Natural disasters Fire, smoke, water Heat, Smoke, Water

Biodegradation Micro-organisms,

insects, rodents

Mould

Mechanical destruction Rubbing, folding Mechanical

Damage

Use Erasures, changes to text Corrections,

re-usage

Scraping Iron Gall Ink

Mishandling, misuse Mechanical

Damage,

Scrunching

Accidents Spillage Blood, Red Wine,

Black Tea,

Iron Gall Ink, Water

Blood, Oil,

Red Wine,

Black Tea,

Aniline Dye,

Iron Gall Ink,

Indian ink

Repairs Historical, conservation

treatments

Water, Sodium

Hypochlorite

Rebinding Unsympathetic

binding

Mechanical

Damage

Palaeographical and

conservation

experiments

Palimpsest text

recovery,

bleaching

Hydrochloric Acid Oil

Reformatting,

digitisation

Mechanical

Damage

UV Light

Natural ageing Controls

The table also highlights the kinds of degradation that occurred naturally to the manuscript during our experiments; these are
identified by the keyword Controls.


resolution. Using both reflective and transmission
imaging allowed both surface and deeper features
to be detected.

(1) Colour reflective imaging. This maximises
sensitivity to surface features on the parchment.
A Nikon D200 camera was mounted on a copy-
stand with white tungsten-halogen incandescent
lighting set at 45° (these emit from 400 to
1,000 nm, with peak power from 500–750 nm;
see Fig. 4). Sixteen bandpass filters with centre
wavelengths 400–700 nm and bandwidth 20 nm
(Unaxis Optics, USA) were fixed in turn to the
front of the camera lens. The camera was fitted
with a Nikkor 105 mm f/2.8 macro lens and set
to an aperture of f/8 throughout.
(2) Colour transmission imaging. This used the
same camera, but the light source was a light-
box beneath the parchment, ensuring that only
light which had passed through the parchment
was detected, thereby increasing sensitivity to
deeper regions of the parchment, and avoiding
specular reflections. Its fluorescent lamps had a
narrower bandwidth and did not provide measurable
power above 800 nm (which did not
affect the colour imaging, because our longest
wavelength filter was 700 nm). The same sixteen
bandpass filters and the same camera lens were
used. We also imaged each sample with no
filter, giving 17 captures per sample.
(3) Monochromatic reflective. In order to
detect light in the near infrared, we used a
monochrome camera (Kodak Megaplus 1.6i)
which did not have an infrared cutoff filter.
This enabled us to investigate longer wave-
lengths using five additional infrared bandpass
filters from 750–950 nm, centred every 50 nm.
This camera has a lower spatial resolution than
the Nikon (1,534×1,024 pixels compared to
3,872×2,592 pixels) and a lower bit depth (12
bits compared to 16 bits). The lens was a
Nikkor 50 mm f/2, which was set at a constant
aperture of f/5.6 for all images. The sampling
resolution on the surface of the parchment was
approximately 12 pixels/mm (300 dpi).

A total of 2,800 images were acquired in two ima-
ging sessions, requiring careful data management
processes to name each sample and record metadata
about the process (Table 2; Giacometti, 2013).
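
The capture counts can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes, as the counts in Table 2 suggest, that the monochrome camera used the sixteen visible filters, the five infrared filters, and one unfiltered capture, and it takes the figure of 25 samples per session directly from Table 2. The variable names are hypothetical.

```python
# Filter centre wavelengths as described in the text (nm).
visible_filters = list(range(400, 701, 20))   # 16 filters, 400-700 nm
infrared_filters = list(range(750, 951, 50))  # 5 filters, 750-950 nm

# One unfiltered capture is added per sample in every modality.
# Assumption: the monochrome camera used visible + infrared filters.
captures = {
    "colour reflective": len(visible_filters) + 1,
    "colour transmission": len(visible_filters) + 1,
    "monochromatic reflective": len(visible_filters) + len(infrared_filters) + 1,
}

SAMPLES_PER_SESSION = 25  # value listed in Table 2
SESSIONS = 2              # before and after degradation

session_total = sum(n * SAMPLES_PER_SESSION for n in captures.values())
grand_total = session_total * SESSIONS

# Sampling resolution: pixels per millimetre converted to dots per inch.
pixels_per_mm = 12
dpi = pixels_per_mm * 25.4

print(captures)
print(session_total, grand_total, round(dpi, 1))
```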

Fig. 3 A diagram of the location of the samples cut from both leaves of the iron gall ink on parchment manuscript.
Each sample is 8 cm square, giving an overall impression of the size of the original parchment

Fig. 4 Imaging set up for colour reflective imaging. The camera is locked facing vertically downwards. The sample is
placed on the copystand over a piece of black card under a sheet of anti-reflective glass. The process of acquiring the
images involved manually exchanging each filter in front of the camera lens, causing small movements of the camera
each time (which meant the resulting images required further image registration, see Giacometti, 2013)

Table 2 Imaging details per session, imaging system and lighting modality, with photograph counts

Session             Modality                  Samples  Captures  Number of images
Before degradation  Colour reflective         25       17        425
                    Colour transmission       25       17        425
                    Monochromatic reflective  25       22        550
                    Session total                                1,400
After degradation   Colour reflective         25       17        425
                    Colour transmission       25       17        425
                    Monochromatic reflective  25       22        550
                    Session total                                1,400
Total                                                            2,800

For each sample, one photograph was taken with each of 16 filters, plus one photograph without a filter.

4.4 Image Processing
The analysis of images created by our experimental
approach is important in several ways, as we can use
various image processing algorithms to produce
estimates of the original writing from the degraded
samples, and calculate how different our results are
from the untreated samples (this is demonstrated in
Fig. 6). There is a variety of image processing
methods suitable for this task, including K-means
clustering9, Principal Components Analysis
(PCA)10, Independent Components Analysis
(ICA)11, and Linear Spectral Mixture Analysis
(LSMA)12, although early on in this research we
demonstrated the limited success of K-means clus-
tering for our application (Giacometti, 2013). We
therefore processed the multispectral images of each
treated sample using three different methods: PCA,
ICA, and LSMA13.
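
To make the idea of decomposing a multispectral stack into layers concrete, the following is a minimal sketch of PCA applied to a synthetic stack using NumPy's singular value decomposition. It is not the implementation used in this study, and ICA and LSMA are not shown. The function name `pca_layers`, the synthetic ink response, and the noise level are assumptions made for illustration.

```python
import numpy as np

def pca_layers(cube, n_components=3):
    """Project a (bands, height, width) cube onto its leading principal components."""
    bands, height, width = cube.shape
    pixels = cube.reshape(bands, -1).T.astype(float)  # (n_pixels, bands), a copy
    pixels -= pixels.mean(axis=0)                     # centre each band
    # SVD of the centred data: the rows of vt are the principal axes.
    _, singular_values, vt = np.linalg.svd(pixels, full_matrices=False)
    scores = pixels @ vt[:n_components].T             # (n_pixels, n_components)
    explained = singular_values**2 / np.sum(singular_values**2)
    layers = scores.T.reshape(n_components, height, width)
    return layers, explained[:n_components]

# Synthetic example: the 'ink' fades with band index, the background stays flat.
rng = np.random.default_rng(0)
ink_mask = np.zeros((32, 32))
ink_mask[10:20, 5:25] = 1.0
ink_response = np.linspace(1.0, 0.1, 8)  # hypothetical spectral response
cube = 0.8 - ink_mask[None] * ink_response[:, None, None] * 0.5
cube += rng.normal(scale=0.01, size=cube.shape)

layers, explained = pca_layers(cube, n_components=2)
print(layers.shape, explained)
```

Each returned layer is an image of component scores. Its sign is arbitrary, so ink may appear light on dark or dark on light, which is one reason an intensity-independent similarity measure is needed when comparing layers with the original writing.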

First, data were cleaned and prepared: each
monochrome image was cropped to a square area
of 980×980 pixels, and the stack of images representing
the same sample was co-registered so that
each pixel represents the same coordinates on the
physical sample. This corrected for movement in the
camera as lens filters were changed, and also for
some degradation treatments which caused the samples
to shrink or curl14. We used a two-step registration
process. An initial linear step performed
gross realignment and a subsequent non-linear
step provided non-rigid registration (Giacometti,
2013). Both steps were implemented using software
called NiftyReg (Modat et al., 2010). This resulted in
a series of images of the treated sample which could
be directly compared pixel-by-pixel to images of the
untreated sample. The three image analysis proced-
ures were applied to these co-registered images. The
output of these consisted of a series of individual
images, each of which was intended to show one
feature of the sample, for example we might have
four images or ‘layers’ representing two inks, a stain,
and the underlying parchment.
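
The purpose of co-registration can be illustrated with a much simpler stand-in than the two-step linear and non-rigid procedure described above. The sketch below estimates only whole-pixel translations between bands by phase correlation and undoes them. It does not use the registration software named in the text and cannot correct shrinkage or curling. The function names `estimate_shift` and `register_stack` are hypothetical.

```python
import numpy as np

def estimate_shift(reference, moving):
    """Estimate the integer (row, col) shift that realigns `moving` with `reference`."""
    cross_power = np.fft.fft2(reference) * np.conj(np.fft.fft2(moving))
    cross_power /= np.abs(cross_power) + 1e-12  # keep the phase only
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(
        int(p) if p <= s // 2 else int(p) - s
        for p, s in zip(peak, correlation.shape)
    )

def register_stack(stack, reference_index=0):
    """Align every band in a (bands, height, width) stack to one reference band."""
    reference = stack[reference_index]
    aligned = np.empty_like(stack)
    for i, band in enumerate(stack):
        shift = estimate_shift(reference, band)
        aligned[i] = np.roll(band, shift, axis=(0, 1))  # circular shift back
    return aligned

# Synthetic check: the same image displaced by known whole-pixel offsets.
rng = np.random.default_rng(1)
reference = rng.random((64, 64))
moved = np.roll(reference, (3, -2), axis=(0, 1))
stack = np.stack([reference, moved, np.roll(reference, (-5, 7), axis=(0, 1))])

print(estimate_shift(reference, moved))
aligned = register_stack(stack)
```

Real bands differ in contrast from one wavelength to the next and may be deformed as well as displaced, which is why a dedicated registration package with a non-rigid step is the appropriate tool in practice.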

A comparison of the recovered images was then
performed. Our aim was to be as objective and
quantitative as possible, which was made difficult
because the layers obtained from PCA and ICA
show degrees of correlation rather than similar
intensities. For example, ink might appear as dark
on a light background in the sample, but the layer
representing ink might appear as a light pattern on a
dark background. A straightforward similarity
measure such as least squares similarity would give
a poor result in this case even if the ink had been
perfectly identified. We therefore used an approach
known as ‘mutual information’ which is a quanti-
tative measure of the amount of information shared
between two images, independent of their colour or
intensity (Wells et al., 1996; Panagiotou, 2010)15. In
our case, a successful identification of the text would
give a high mutual information score if the patterns

were correctly identified, irrespective of the colour
assigned to the recovered layer.
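
A minimal sketch of this measure, assuming a simple joint-histogram estimate of mutual information, is given below. The number of bins and the synthetic test images are illustrative choices rather than the settings used in this study, and the function name `mutual_information` is hypothetical. The example shows that an inverted copy of an image shares as much information with the original as the original shares with itself, even though a squared-difference measure would report a large error.

```python
import numpy as np

def mutual_information(image_a, image_b, bins=32):
    """Mutual information (in bits) from the joint grey-level histogram of two images."""
    joint, _, _ = np.histogram2d(image_a.ravel(), image_b.ravel(), bins=bins)
    p_ab = joint / joint.sum()             # joint probability
    p_a = p_ab.sum(axis=1, keepdims=True)  # marginal of image_a, shape (bins, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)  # marginal of image_b, shape (1, bins)
    independent = p_a @ p_b                # product of the marginals
    nonzero = p_ab > 0                     # skip empty cells to avoid log(0)
    return float(np.sum(p_ab[nonzero] * np.log2(p_ab[nonzero] / independent[nonzero])))

rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(64, 64)).astype(float)
inverted = 255.0 - original  # same pattern, opposite polarity
unrelated = rng.integers(0, 256, size=(64, 64)).astype(float)

mi_same = mutual_information(original, original)
mi_inverted = mutual_information(original, inverted)
mi_unrelated = mutual_information(original, unrelated)
mse_inverted = float(np.mean((original - inverted) ** 2))

print(mi_same, mi_inverted, mi_unrelated, mse_inverted)
```

Histogram-based estimates of this kind are biased upwards for small images or many bins, so scores are best compared between images of the same size processed with the same binning.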

4.5 Controlled Damage
We describe three of the twenty methods of destruc-
tion in more detail below. These three examples
were chosen to illustrate different classes of
damage: one where the ink was physically removed
(and therefore no longer remains on the document);
one where the ink remains in its original form but
has been obliterated by a stain; and finally one
where the ink remains but has been chemically
altered by a bleaching agent.

4.5.1 Scraping

Forcibly removing the surface layer and the visible
writing off the parchment with pumice stone16 was
a commonly employed method to re-use parchment
(Diringer, 1953; Netz and Noel, 2007), and therefore
is a common problem for which multispectral ima-
ging may help in recovering the erased text. We
divided our sample into three areas, leaving the
top area untreated, the middle area scraped gently,
and the bottom area scraped thoroughly for a longer
period. The scraping was undertaken by beating a
pumice stone until it became powder, and, using a
piece of cotton wool, scraping the parchment with
this powder, in circular motions.

4.5.2 Blood

Bloodstains are not uncommon in historical docu-
ments (Wechsler, 1952), and historical texts can
even be written in blood (Gurkina and Rebrikova,
2001; Kieschnick, 2001). Moreover, blood looks
similar to iron gall ink, as both haemoglobin and
iron gall are rich in the same colour-carrying iron
oxides, which contributes to the difficulty of separating
ink from bloodstains. In criminology, multispectral
imaging has been found useful for
dating blood stains at crime scenes (Edelman et al.,
2012). We stained our parchment with human
blood (obtained as expired blood from the UCLH
Blood Transfusion Unit), until it fully penetrated
the parchment. The excess blood was removed
with blotting paper.


4.5.3 Sodium hypochlorite

Sodium hypochlorite is a strongly alkaline substance
and a common bleaching agent. Reports of experiments
with both paper and parchment describe two
bleaching effects: the parchment turns
pink and the iron gall ink fades (Smith, 2012).
There are also anecdotes of unscrupulous bleaching
of parchment by curators and conservators in an
attempt to read them (Blagden, 1787; Fuchs,
2003). In our method, 10–15% sodium hypochlorite
(NaOCl) was diluted in 40 mL of de-ionised water
(pH 5.5) until a pH of 13.0 was reached. This solution
was then applied to the parchment.

5 Results

We describe the results of imaging our three meth-
ods of destruction in more detail below. We
concentrate here on results obtained from mono-
chromatic reflective imaging, as the analysis of
these images has produced useful outputs. See
Giacometti (2013) for further research on our other
imaging modalities.

5.1 Example 1: Scraping
Figure 5 shows the initial sample, a photograph of
the sample after scraping and the best recovered
image. The scraping has rendered the affected
areas illegible to the naked eye (centre of Fig. 5).
There are traces of text remaining on the middle
third of the sample, where the scraping was lighter
but the bottom third has very few marks where the
writing used to be. On the top third, where a single
word was removed, there is a darker mark on the
parchment, but no traces of the word can be seen.
The degradation to the integrity of the parchment is
also visible in the samples: the parchment has
become physically thinner in the scraped areas.

Figure 6 shows the recovered component images
from the sample shown in Fig. 5. The first row of
Fig. 6 shows the four largest principal components
of the images of the untreated samples. It can be
seen that the pattern of the text is recovered suc-
cessfully, but the intensity is not faithfully recov-
ered—the text is white on black whereas in the
original sample (see Fig. 5), it is black on white.

This illustrates the effect described in section 4.4
and why the images must be compared quantitatively
with a technique such as mutual information, which is
sensitive to the overall patterns but not to the
absolute values. Each subsequent row displays four
registered recovery images resulting from one of
the three image processing algorithms (rows
2–4: PCA, ICA, and LSMA, respectively).

The mutual information between images of the
damaged and undamaged samples ranged from zero
where there was no information shared between a
pair of images to 0.247, which was the highest quan-
titative similarity and which occurred between
NO-PC1 and SC-ICA3 in Fig. 6 above17.

In this sample, then, the treatment has significantly
affected the writing, but there is information
that can be recovered using multispectral imaging.
We can visualise some recovery of the writing,
and the quantitative comparison indicates that ICA
achieved the best result.

5.2 Example 2: Blood
On our sample section shown in Fig. 7, the blood
treatment has obscured the writing on the bottom
half of the sample. The bloodstains are of a dark
brownish colour. The lack of contrast between the
stained parchment and the ink renders the writing
almost illegible. The treatment has also changed the
geometry of the sample, shrinking it slightly. This
effect has been corrected in the recovery image by
co-registering it with the image of the sample before
the treatment shown in Figure 8.

In this case, the affected area of the sample that
holds ink is stained, and any interpretation of the
sample is difficult on the unprocessed images. The
highest mutual information measure was 0.254 and
occurred between the unprocessed image and
BL-PC2. The mutual information between the
unprocessed image and BL-LS3 was almost equally
high (0.240).

Our systematic data gathering enables further
comparison. We can look at the intensity of each
pixel as a function of wavelength (Fig. 9). The top
part of Fig. 9 shows the untreated parchment.
The band of pixels with high reflectance at all
wavelengths corresponds to the background, or
the parchment substrate. A smaller, less clustered
group which corresponds to the writing is darker at
shorter wavelengths, but has similar reflectance to
the parchment around infrared. However, after the
treatment is applied (bottom of Fig. 9), a second
band of pixels corresponding to the blood appears
and follows a similar path to that of the ink, with
the response across the spectrum appearing similar.
This demonstrates that the wavelength dependence
of blood is similar to that of much of the ink and
indicates why the blood makes the ink difficult to
read (see also the spectral reconstruction in
MacDonald et al., 2013).
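The kind of per-material comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis behind Fig. 9: the band centres, the toy reflectance curves, the label map, and the helper names `mean_spectrum` and `spectral_angle` are all hypothetical and only mimic the qualitative behaviour reported in the text.

```python
import numpy as np

# Illustrative band centres in nm (an assumption, not the bands used in the study).
wavelengths = np.linspace(400, 1000, 13)
t = (wavelengths - wavelengths[0]) / (wavelengths[-1] - wavelengths[0])

# Toy reflectance curves; they are not measured spectra.
toy_spectra = np.stack([
    np.full_like(t, 0.85),   # 0: parchment, bright at all wavelengths
    0.15 + 0.65 * t,         # 1: ink, dark at short wavelengths, brighter in infrared
    0.20 + 0.60 * t ** 2,    # 2: blood, a broadly similar rising curve
])

# Label map stating which material each pixel shows.
labels = np.zeros((30, 30), dtype=int)
labels[5:10, :] = 1
labels[18:30, :] = 2

rng = np.random.default_rng(1)
# Cube with shape (bands, rows, cols): one image per wavelength.
cube = toy_spectra[labels].transpose(2, 0, 1) + rng.normal(0, 0.01, (13, 30, 30))

def mean_spectrum(cube, mask):
    """Average intensity at each wavelength over the pixels selected by mask."""
    return cube[:, mask].mean(axis=1)

def spectral_angle(s1, s2):
    """Angle in radians between two spectra; small angles mean similar shapes."""
    cos = np.dot(s1, s2) / (np.linalg.norm(s1) * np.linalg.norm(s2))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

parchment, ink, blood = (mean_spectrum(cube, labels == k) for k in range(3))
angle_ink_blood = spectral_angle(ink, blood)
angle_ink_parchment = spectral_angle(ink, parchment)
print(f"ink vs blood: {angle_ink_blood:.3f} rad, ink vs parchment: {angle_ink_parchment:.3f} rad")
```

A small angle between the ink and blood spectra, relative to the angle between ink and parchment, is one simple way to express why a stain with a similar wavelength dependence is hard to separate from the writing.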

In this case, the recovery estimate shows a clear
trace of the writing with identifiable letters, even
though the image of the treated sample looks, to
the human eye, too obscured to be legible. Figure 9
shows that the spectrum of blood is similar to that of
ink, so even multispectral imaging might not be com-
pletely successful. However, we have shown that PCA
is capable of enhancing text, and that PCA is the first
image processing method that should be tried when
faced with such damage. LSMA also performed well
and should be considered if good knowledge of the
spectra of the blood and ink is available.
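To show why knowledge of the material spectra matters for spectral mixture analysis, the sketch below unmixes a synthetic cube with plain unconstrained least squares. This is a deliberate simplification and an assumption on our part: the variant used in the study may apply constraints (for example, the fully constrained method of Heinz and Chang, 2001), and the endmember curves, the scene layout, and the function name `unmix` are hypothetical.

```python
import numpy as np

def unmix(cube, endmembers):
    """Unconstrained least-squares abundance maps.

    cube: array (bands, rows, cols); endmembers: array (bands, n_materials).
    """
    bands, rows, cols = cube.shape
    pixels = cube.reshape(bands, -1)  # one column per pixel
    abundances, *_ = np.linalg.lstsq(endmembers, pixels, rcond=None)
    return abundances.reshape(endmembers.shape[1], rows, cols)

t = np.linspace(0.0, 1.0, 13)  # normalised wavelength axis (illustrative)
endmembers = np.stack([
    np.full_like(t, 0.85),     # parchment (toy spectrum)
    0.15 + 0.65 * t,           # ink (toy spectrum)
    0.20 + 0.60 * t ** 2,      # blood (toy spectrum)
], axis=1)                     # shape (13, 3)

# Ground-truth abundances: fractions of each material in every pixel.
true = np.zeros((3, 20, 20))
true[0] = 1.0                                                    # pure parchment by default
true[:, 4:8, :] = np.array([0.0, 1.0, 0.0]).reshape(3, 1, 1)     # pure ink
true[:, 12:20, :] = np.array([0.0, 0.0, 1.0]).reshape(3, 1, 1)   # pure blood
true[:, 14:16, :] = np.array([0.0, 0.5, 0.5]).reshape(3, 1, 1)   # ink under blood

rng = np.random.default_rng(3)
cube = np.tensordot(endmembers, true, axes=1) + rng.normal(0, 0.001, (13, 20, 20))

recovered = unmix(cube, endmembers)
max_error = float(np.abs(recovered - true).max())
print(f"largest abundance error: {max_error:.4f}")
```

The ink layer is `recovered[1]`; in this noise-light toy case it separates cleanly from the blood layer only because the two endmember spectra are known and are not identical in shape.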

5.3 Example 3: Sodium hypochlorite
The writing on the sample treated with sodium
hypochlorite has become very faint (see the
bottom part of the central image in Fig. 10). Both
the parchment and the writing have become lighter.
The areas where the writing has remained visible
appear to be where there are stronger ink marks,

suggesting that the ink penetrated deeper into the
structure of the parchment. During the application
of the treatment, the parchment started to lose
structural integrity and some small particles sepa-
rated from the sample. The treatment was stopped
earlier than planned because of this, as the sample
needed to be preserved intact for imaging purposes.
After drying, the sample became stable again.
However, it remained in a fragile state, lighter to
the touch, more transparent, smoother, and more
flexible than before treatment.

In this case (Fig. 11), the maximum mutual information
was 0.294 and occurred for SH-PC1. The
second highest occurred for SH-IC4 (0.237). Even
the best recovered estimate (right image in Fig. 10)
appears to be very similar to the image of the treated
sample (middle image of Fig. 10), and our methods
have not been able to extract any further informa-
tion from the text. Multispectral imaging cannot
recover text in every example of damage, and our
systematic investigations indicate where it is worth-
while using multispectral imaging, and where
it simply will not recover any additional informa-
tion from damaged and deteriorated texts. Our
research indicates that text damaged by bleaching
agents such as sodium hypochlorite might not gen-
erate useful results when imaged multispectrally:
this could inform cost analysis on when it may, or
may not be, worth imaging particular texts which
have indications of this sort of damage.

Further results and discussion regarding all of
our degradation techniques, and the results from
both capture and processing are catalogued in
Giacometti (2013), providing a framework in
which to understand the successes and limitations
of multispectral imaging and the image processing
algorithms used.

Fig. 5 Sample O609R (left to right) before treatment, after treatment (scraping), and the best possible recovery estimate
using our methods (third independent component)

6 Discussion

Our approach, acquiring multispectral images of
historical parchment from a set of samples before
and after they were submitted to various forms of
degradation, is novel. In this work, we attempted
to recover writing from multispectral images, whilst
objectively and quantitatively evaluating the effectiveness
of image processing algorithms. Our approach
successfully identifies the samples which
contain more mutual information shared with the
original text, and successfully ranks partial recovery
of information.

Fig. 6 Sample O609R (scraping) image processing results by three different processing algorithms. The top row includes
PCA of the untreated samples, the next row the PCA of the treated samples. Row three indicates the results further
processed with ICA, row four shows LSMA: both highlight similarities between the original and the treated
samples

The effect of each of the twenty treatments on
both the parchment and the visibility of the writing
it carries varied significantly. In some samples in
which the writing has been rendered unreadable by
the treatment, the writing can be recovered; these
include the samples treated with aniline dye, oil, and
blood. In other samples the writing is completely
obscured or the parchment has been severely affected
and recovery is all but impossible; these include the
samples treated with iron gall ink, India ink, and mould.
In most cases, however, the image processing algo-
rithms can extract more information from the multi-
spectral images of treated samples corresponding to
the writing than the human eye can see.

PCA outperformed ICA and LSMA as the image
processing means by which to produce accurate recovery
estimates for almost all the samples (although
one of our examples shown above, the scraped
fragment, was more successfully recovered with ICA).
This shows that there is not one approach or algo-
rithm which suits all types of document degradation,
and that the specific condition of a document affects
the processing methods which should be used on
resulting images. However, PCA is a standard pro-
cessing algorithm which appears to be accurate and

robust in this application, and is therefore recom-
mended to be used as the first in a range of processes
when analysing multispectral images. Further pro-
cessing may yield improved results.
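As a minimal sketch of what a first PCA pass over a multispectral stack involves, the fragment below projects a synthetic cube onto its leading principal components. It is not the pipeline used in this study: the function name `principal_component_images`, the number of bands, and the toy "text" pattern are illustrative assumptions.

```python
import numpy as np

def principal_component_images(cube, n_components=4):
    """Project a (bands, rows, cols) cube onto its leading principal components."""
    bands, rows, cols = cube.shape
    pixels = cube.reshape(bands, -1).T          # (n_pixels, bands)
    centred = pixels - pixels.mean(axis=0)      # remove the mean spectrum
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    scores = centred @ vt[:n_components].T      # (n_pixels, n_components)
    explained = s ** 2 / np.sum(s ** 2)         # fraction of variance per component
    images = scores.T.reshape(n_components, rows, cols)
    return images, explained[:n_components]

# Synthetic stack: text that is dark at short wavelengths and fades towards infrared.
rng = np.random.default_rng(2)
text_mask = np.zeros((40, 40))
text_mask[10:13, 5:35] = 1.0
text_mask[25:28, 5:35] = 1.0
contrast = np.linspace(0.6, 0.05, 8)            # per-band ink contrast (illustrative)
cube = 0.85 - contrast[:, None, None] * text_mask + rng.normal(0, 0.02, (8, 40, 40))

images, explained = principal_component_images(cube)

# The sign of a principal component is arbitrary, so compare with an absolute correlation.
pc1_vs_text = abs(np.corrcoef(images[0].ravel(), text_mask.ravel())[0, 1])
print(f"PC1 explains {explained[0]:.2%} of the variance; |corr| with text = {pc1_vs_text:.3f}")
```

Because the sign of each component is arbitrary, a recovered layer may show the text light on dark, which is one reason an intensity-independent measure such as mutual information is used for the comparisons reported here.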

Our research depends on deliberately degrading
square samples cut from a real historical iron gall
ink manuscript on parchment. This degradation was
necessary to model the type of documentary damage
commonly seen in historical documents, and to
understand how they affect the reading and inter-
pretation of writing, both before and after multi-
spectral imaging of the samples. The critical
destruction is therefore a core part of our method,
as it is central to a complete understanding of the
effectiveness of multispectral imaging on primary
historical texts.

However, our approach does not provide system-
atic information about any single degradation cause.
There is much research to follow on from this, given
that we have shown that using carefully prepared
historical evidence can provide an effective frame-
work for the evaluation and analysis of the applica-
tion of multispectral imaging. Additional analysis of
our data is possible, and we have already carried out
further research into the estimation of spectral sig-
natures of the materials present in the documents
from the collected multispectral images
(MacDonald et al., 2013). We envisage that the
dataset has the potential to become an invaluable
asset for libraries and archives, research in conservation,
and various problems in image and signal
processing, and have made all of the data generated
from this project available for use by others18. Our
dataset provides physical information about how parchment
reacts to various forms of degradation, and
also provides documentation on acquisition, and
will provide a resource for future research (reducing
the need for experimentation on valuable primary
historical texts). Our next step will be to carry out a
similar process concentrating solely on systematically
reproducing different degrees of an individual
type of degradation (such as water damage or smoke
and heat) to provide further information to help
both future conservation and digitisation efforts.

Fig. 7 Sample I208R (left to right) before treatment, after treatment (blood), and the best possible recovery estimate
using our methods (second principal component)

Fig. 8 Sample I208R (blood) image processing results by processing algorithm. The top row includes PCA of the
untreated samples, the next row the PCA of the treated samples. Row three indicates the results processed with ICA,
row four shows LSMA: both highlight similarities between the original and the treated samples


Fig. 9 Sample I208R (blood) spectral intensity against wavelength. Above: before treatment. Below: after treatment

Fig. 10 Sample O605R (left to right) before treatment, after treatment (sodium hypochlorite), and the best possible
recovery estimate using our methods (first principal component)


7 Conclusion

As multispectral imaging becomes more frequently
used in the cultural and heritage sectors, it is important
to understand the framework which underpins
its application to the capture and analysis of
primary historical texts. Our research has provided a
systematic methodology for the continuing study
and evaluation of the techniques involved in the
analysis and processing of multispectral images of
degraded cultural heritage documents, and a basis
for further testing and development. Understanding
the most efficient way to apply these techniques to
damaged and abraded texts is central to ensuring
that the images created when using multispectral
imaging, which become evidence to be used by a
range of scholars including historians, palaeographers,
and papyrologists, can be trusted by
researchers, whilst also making the most efficient
use of resources. Our systematic approach provides
a framework for the analysis of deteriorated documents
using multispectral techniques.

Fig. 11 Sample O605R (sodium hypochlorite) image processing results by processing algorithm. The top row includes
PCA of the untreated samples, the next row the PCA of the treated samples. Row three indicates the results further
processed with ICA, row four shows LSMA

Carrying out this type of interdisciplinary
research facilitates a deeper understanding of the
artefacts, multispectral imaging, and image process-
ing methods. Specifically, it provides a methodology
for the continuing study of the techniques involved
in the analysis and processing of multispectral
images of degraded cultural heritage documents,
and a framework for further testing and development.
It has required input from conservators, archivists,
digitisation specialists, medical physicists, engineers,
and computer scientists, all collaborating
in a Digital Humanities project where aspects of
computing are advanced as much as our understanding
of a process that can be useful for humanities
scholars. Our unique methodology, where the
destruction of a historical text is necessary to ac-
quire experimental data for evaluation, can now
be used to evaluate a process for reading other,
more valuable, historical texts. Our combined crit-
ical approach to a developing technology allows us
to advise and steer the application of multispectral
techniques to primary historical texts.

Funding

This work was supported by the Engineering and
Physical Sciences Research Council [grant number
EP/F01208X/1]. We would like to thank London
Metropolitan Archives for donating the parchment
manuscript which allowed us to carry out this
research.

References
Attas, E. M. (2004). Enhancement of document legibility using spectroscopic imaging. Archivaria, 57: 131–46.

Balas, C., Papadakis, V., Papadakis, N., Papadakis, A., Vazgiouraki, E. and Themelis, G. (2003). A novel hyper-spectral imaging apparatus for the non-destructive analysis of objects of artistic and historic value. Journal of Cultural Heritage, 4(S1): 330–7.

Barnett, T., Chalmers, A., Diaz-Andreu, M., Ellis, G., Longhurst, P., Sharpe, K. and Trinks, I. (2005). 3D laser scanning for recording and monitoring rock art erosion. International Newsletter on Rock Art, 41: 25–9.

Baumann, R., Porter, D. C. and Seales, W. B. (2008). The use of Micro-CT in the study of archaeological artifacts. 9th International Conference on NDT of Art. Jerusalem, Israel.

Blagden, C. (1787). Some observations on ancient inks, with the proposal of a new method of recovering the legibility of decayed writings: by Charles Blagden, M. D. Sec. R. S. and F. A. S. Philosophical Transactions of the Royal Society of London, 77: 451–7.

Bonanni, L., Xiao, X., Hockenberry, M., Subramani, P., Ishii, H., Seracini, M. and Schulze, J. (2009). Wetpaint: scraping through multi-layered images. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ‘09), New York, NY: ACM, pp. 571–4. http://doi.acm.org/10.1145/1518701.1518789. doi:10.1145/1518701.1518789.

Chabries, D. M., Booras, S. W. and Bearman, G. H. (2003). Imaging the past: recent applications of multispectral imaging technology to deciphering manuscripts. Antiquity, 77(296): 359–72.

Chahine, C. (2000). Changes in hydrothermal stability of leather and parchment with deterioration: a DSC study. Thermochimica Acta, 365(1–2): 101–10.

Clarkson, C. (1992). Rediscovering parchment: the nature of the beast. The Paper Conservator, 16(1): 5–26.

Conway, P. (2008). Best practices for digitizing photographs: a network analysis of influences. Proceedings of IS&T’s Archiving 2008, Imaging Science and Technology, Berne, 24–27 June.

Crowther, C., Nyhan, J., Tarte, S. and Dahl, J. (2014). New and recent developments in image analysis: theory and practice. Panel Session, Digital Humanities 2014. http://dharchive.org/paper/DH2014/Panel-759.xml

Deegan, M. and Tanner, S. (2002). Digital Futures: Strategies for the Information Age. London: Library Association Publishing.

Diringer, D. (1953). The Book Before Printing: Ancient, Medieval, and Oriental. Mineola, NY: Courier Dover Publications.


DH2014 (2014). Plenary sessions, panels, long papers, short papers, posters and workshops at digital humanities 2014. http://dharchive.org/

Dobrusina, S. A. and Visotskite, V. K. (1994). Chemical treatment effects on parchment properties in the course of ageing. Restaurator, 15(4): 208–19.

Dolgin, B., Bulatov, V. and Schechter, I. (2007). Non-destructive assessment of parchment deterioration by optical methods. Analytical and Bioanalytical Chemistry, 388(8): 1885–96.

Earl, G., Martinez, K. and Malzbender, T. (2010). Archaeological applications of polynomial texture mapping: analysis, conservation and representation. Journal of Archaeological Science, 37(8): 2040–50.

Easton, R. L., Jr., Knox, K. T. and Christens-Barry, W. A. (2003). Multispectral imaging of the Archimedes palimpsest. Proceedings of the 32nd Applied Imagery Pattern Recognition Workshop, San Jose, California, pp. 111–6.

Easton, R. L., Jr., Knox, K. T., Christens-Barry, W. A., Boydston, K., Toth, M. B., Emery, D. and Noel, W. (2010). Standardized system for multispectral imaging of palimpsests. Proceedings of SPIE 7531, Computer Vision and Image Analysis of Art 75310D.111, San Jose, California.

Edelman, G., van Leeuwen, T. G. and Aalders, M. C. (2012). Hyperspectral imaging for the age estimation of blood stains at the crime scene. Forensic Science International, 223(1–3): 72–7.

Everdell, N. L., Styles, I. B., Claridge, E., Hebden, J. C. and Calcagni, A. S. (2009). Multispectral imaging of the ocular fundus using LED Illumination. In Depeursinge, C. and Vitkin (eds), Novel Optical Instrumentation for Biomedical Applications IV, Vol. 7371. Proceedings of SPIE-OSA Biomedical Optics. Optical Society of America, Munich, Germany.

Fuchs, R. (2003). The history of chemical reinforcement of texts in manuscript - what should we do now? In Fellows-Jensen, G. and Springborg, P. (eds), Care and Conservation of Manuscripts 7: Proceedings of the Seventh International Seminar Held at the Royal Library, Vol. 7. Copenhagen, Denmark: Museum Tusculanum Press.

Giacometti, A., Campagnolo, A., MacDonald, L., Mahony, S., Terras, M., Robson, S., Weyrich, T. and Gibson, A. (2012). Documenting Parchment Degradation via Multispectral Imaging. Proceedings of BCS Conference on Electronic Imaging and the Visual Arts (EVA), London, pp. 301–8.

Giacometti, A. (2013). Evaluating Multispectral Imaging Processing Methodologies for Analysing Cultural Heritage Documents. Ph.D. thesis, University College London, forthcoming.

Giacometti, A., Terras, M. and Gibson, A. (2015). Objectively evaluating text recovery methodologies for multispectral images of palimpsests. International Journal of Heritage in the Digital Era, 15th Issue dedicated to Computer Vision in Cultural Heritage.

Giurginca, M., Lacatusu, I., Miu, L. and Petroviciu, I. (2009). Parchment behaviour under extreme heat and fire conditions, 13(3): 337–9.

Goltz, D. M., Cloutis, E., Norman, L. and Attas, M. (2007). Enhancement of faint text using visible (420–720 nm) multispectral imaging. Restaurator, 2007: 11–28.

Goltz, D. and Hill, G. (2012). Hyperspectral Imaging of Daguerreotypes. Restaurator: International Journal for the Preservation of Library and Archival Material, 33(1): 1–16.

Gonzalez, R. C. and Woods, R. E. (1993). Digital Image Processing. Reading, Massachusetts: Addison-Wesley Publishing.

Góra, M., Pircher, M., Götzinger, E., Bajraszewski, T., Strlic, M., Kolar, J., Hitzenberger, C. K. and Targowski, P. (2006). Optical coherence tomography for examination of parchment degradation. Laser Chemistry, 68: 1–6.

Gurkina, S. and Rebrikova, N. (2001). Treatment of parchment fragments of a Hebrew Bible. Restaurator, 22(3): 181–6.

Gray, R. and Neuhoff, D. (1998). Quantization. IEEE Transactions on Information Theory, 44(6): 2325–83.

Hardeberg, J. Y., Schmitt, F. and Brettel, H. (2002). Multispectral color image capture using a liquid crystal tunable filter. Optical Engineering, 41(10): 2532–48.

Hartigan, J. A. and Wong, M. A. (1979). Algorithm AS 136: a K-Means clustering algorithm. Journal of the Royal Statistical Society: Series C (Applied Statistics), 28(1): 100–8.

Heinz, D. and Chang, C. I. (2001). Fully constrained least squares linear spectral mixture analysis method for material quantification in hyperspectral imagery. IEEE Transactions on Geoscience and Remote Sensing, 39(3): 529–45.

Hill, D. L. G., Batchelor, P. G., Holden, M. and Hawkes, D. J. (2001). Medical image registration. Physics in Medicine and Biology, 46(3): R1–45.

Hollaus, F., Gau, M. and Sablatnig, R. (2013). Acquisition and Enhancement of Multispectral Images of Ancient Manuscripts. Berlin, Germany: Kultur und Informatik: Visual Worlds and Interactive Spaces, pp. 187–97.

Hughes, L. (2004). Digitizing Collections: Strategic Issues for the Information Manager. London: Facet Publishing.

Hyvärinen, A., Karhunen, J. and Oja, E. (2001). Independent Component Analysis. New York, NY: John Wiley and Sons.

Information in Images (2014). Multispectral document imaging. www.informationinimages.com/#!multispectral-document-scanning/c1yhe

Jolliffe, I. T. (2002). Principal Component Analysis. New York, NY: Springer-Verlag.

Joo Kim, S., Deng, F. and Brown, M. S. (2011). Visual enhancement of old documents with hyperspectral imaging. Pattern Recognition, 44(7): 1461–9.

Kennedy, C. J. and Wess, T. J. (2006). Chapter 4 The use of X-ray scattering to analyse parchment structure and degradation. In David, B. and Dudley, C. (eds), Physical Techniques in the Study of Art, Archaeology and Cultural Heritage. Elsevier, pp. 151–72.

Kieschnick, J. (2001). Blood writing in Chinese Buddhism. Journal of the International Association of Buddhist Studies, 23(2): 177–94.

Knox, K. T. (2008). Enhancement of overwritten text in the Archimedes Palimpsest. Proc. SPIE 6810, Computer Image Analysis in the Study of Art, 681004 (29 February 2008); doi: 10.1117/12.766679.

Knox, K. T., Easton, R. L., Jr., Christens-Barry, W. A. and Boydston, K. (2011). Recovery of handwritten text from the diaries and papers of David Livingstone. Proceedings of SPIE 7869, Computer Vision and Image Analysis of Art II 786909, pp. 1–7.

Landgrebe, D. (1999). Information extraction principles and methods for multispectral and hyperspectral image data. In Chen, C. (ed.), Information Processing for Remote Sensing. River Edge, NJ: World Scientific Publishing Company, pp. 3–38.

Larsen, R. (2007). Introduction to damage and damage assessment. In Larsen, R. (ed.), Improved Damage Assessment of Parchment (IDAP): Assessment, Data Collection and Sharing of Knowledge, 1st edn. European Commission, Directorate-General for Environment, pp. 17–21.

Lucchese, L. and Mitra, S. K. (2001). Color image segmentation: a state-of-the-art survey. Proceedings of the Indian National Science Academy (INSA-A), Delhi: Indian National Science Academy, 67(2): 207–21.

MacDonald, L. and Jacobsen, R. (2006). Assessing image quality. In MacDonald, L. (ed.), Digital Heritage, Applying Digital Imaging to Cultural Heritage. Oxford: Butterworth-Heinemann, pp. 351–74.

MacDonald, L. W., Giacometti, A., Campagnolo, A., Robson, S., Weyrich, T., Terras, M. and Gibson, A. (2013). Multispectral imaging of degraded parchment. Computational Color Imaging, 4th International Workshop, CCIW 2013, Chiba, Japan, 3–5 March 2013. Proceedings. In Tominaga, S., Schettini, R., and Trémeau, A. (eds), Lecture Notes in Computer Science, Vol. 7786, Chiba, Japan: Springer Berlin Heidelberg.

Marengo, E., Manfredi, M., Zerbinati, O., Robotti, E., Mazzucco, E., Gosetti, F., Bearman, G., France, F. and Shor, P. (2011). Development of a technique based on multi-spectral imaging for monitoring the conservation of cultural heritage objects. Analytica Chimica Acta, 706(2): 229–37.

Meghea, A., Giurginca, M., Iftimie, N., Miu, L., Viorica, B. and Budrugeac, P. (2004). Behaviour to accelerate ageing of some natural biopolymer constituents of parchment. Molecular Crystals and Liquid Crystals, 418(1): 285–90.

Modat, M., Ridgway, G. R., Taylor, Z. A., Lehmann, M., Barnes, J., Hawkes, D. J., Fox, N. C. and Ourselin, S. (2010). Fast free-form deformation using graphics processing units. Computer Methods and Programs in Biomedicine, 98(3): 278–84.

Netz, R. and Noel, W. (2007). The Archimedes Codex: How a Medieval Prayer Book Is Revealing the True Genius of Antiquity’s Greatest Scientist. 1st edn. Da Capo Press, London.

Panagiotou, C. (2010). Information Theoretic Regularization in Diffuse Optical Tomography. Ph.D. thesis. London: University College London.

Peatross, J. and Ware, M. (2013). Physics of Light and Optics. Provo: Brigham Young University.

Ponto, K., Seracini, M. and Kuester, F. (2009). Wipe-off: an intuitive interface for exploring ultra-large multispectral data sets for cultural heritage diagnostics. Computer Graphics Forum, 28(8): 2291–301.

Rapantzikos, K. and Balas, C. (2005). Hyperspectral imaging: potential in non-destructive analysis of palimpsests. IEEE International Conference on Image Processing, 2.

Ramsay, S. and Rockwell, G. (2012). Developing things: towards an epistemology of building in the digital humanities. In Gold, M. K. (ed.), Debates in the Digital Humanities. Minneapolis: University of Minnesota Press, pp. 75–84.

Ratto, M. (2011). Critical making: conceptual and material studies in technology and social life. The Information Society, 27(4): 252–60.

Reed, R. (1972). Ancient Skins, Parchments and Leathers. London, UK: Seminar Press.

Salerno, E., Tonazzini, A. and Bedini, L. (2007). Digital image analysis to enhance underwritten text in the Archimedes palimpsest. International Journal of Document Analysis and Recognition (IJDAR), 9(2–4): 79–97.

Schuman, R. (2014). I tweeted a joke that started a big Ass Ruckus: pan kisses Kafka. http://pankisseskafka.com/2014/01/08/i-tweeted-a-joke-that-started-a-big-ass-ruckus/ (accessed 8 January 2014).

Senvaitenë, J., Beganskienë, A., Tautkus, S., Padarauskas, A. and Kareiva, A. (2005). Characterization of historical writing inks by different analytical techniques. Chemija, 16(3–4): 34–8.

Smith, T. (2012). An evaluation of historical bleaching with chlorine dioxide gas, sodium hypochlorite, and chloramine-T at the Fogg art museum. Restaurator, 33(3–4): 249–73.

Tanner, S. and Bearman, G. (2009). Digitising the Dead Sea Scrolls: Archiving 2009. Arlington, VA: The Society for Imaging Science and Technology, pp. 119–23.

Terras, M. (2006a). Image to Interpretation: Intelligent Systems to Aid Historians in the Reading of the Vindolanda Texts. Oxford Studies in Ancient Documents, Oxford University Press, Oxford.

Terras, M. (2006b). Disciplined: Using educational studies to analyse humanities computing. Literary and Linguistic Computing, 21(2): 229–46.

Terras, M. (2008). Digital Images for the Information Professional. London: Ashgate.

The National Archives (2015). Deaccessioning and disposal: guidance for archive services. www.nationalarchives.gov.uk/documents/Deaccessioning-and-disposal-guide.pdf

Vnouček, J. (2007). Typology of the damage of the

parchment in manuscripts of the codex form. In

Larsen, R. (ed.), Improved Damage Assessment

of Parchment (IDAP): Assessment, Data Collection

and Sharing of Knowledge, 1st edn. European

Commission, Directorate- General for Environment,

Luxembourg, pp, 27–30.

Wells, W. M., Jr III., Viola, P., Atsumi, H., Nakajima, S.
and Kikinis, R. (1996). Multimodal volume registration
by maximization of mutual information. Medical
Image Analysis, 1(1): 35–51.

Weingart, S. (2013a). Acceptances to digital humanities
2013 (part 1). www.scottbot.net/HIAL/?p=35242
(accessed 25 April 2013).

Weingart, S. (2013b). Submissions to digital humanities
2014. www.scottbot.net/HIAL/?p=39588 (accessed 5
November 2013).

Weingart, S. (2014). Acceptances to digital humanities
2014 (part 1). www.scottbot.net/HIAL/?p=40695
(accessed 10 April 2014).

Wechsler, T. (1952). The origin of the so called dead sea

scrolls. The Jewish Quarterly Review, 43(2): 121–39.

Workman, J. and Weyer, L. (2007). Practical Guide to

Interpretive Near-Infrared Spectroscopy. CRC Press,

London.

Zolfagharifard, E. (2014). Does the Bible have secrets to
reveal? Scholars hope to restore hidden text in ancient
New Testament manuscript.
http://www.dailymail.co.uk/sciencetech/article-2752384/Scholars-hope-restore-hidden-text-ancient-New-Testament-manuscript.html#ixzz3LgzpVv7O

Notes
1 An analysis of submissions to the Digital Humanities
2014 conference carried out by Scott Weingart demonstrates
that work on text processing remains the core
focus of the Digital Humanities community (Weingart,
2013b), with an analysis of DH2014 acceptances indicating
that ‘Literary studies, text analysis, and text
mining still reign supreme’ (Weingart, 2014). This follows
the same trends identified in Weingart’s analysis of
Digital Humanities 2013 acceptances (Weingart,
2013a). An earlier analysis of the most used words in
the ACH/ALLC Conference abstracts 1996–2005
(Terras, 2006b, p. 236) indicates that text was also the
focus of this earlier work of the Digital Humanities
community.

2 At the Digital Humanities conference 2014, one of the
eight panels was devoted to image processing
(Crowther et al., 2014), and the program also contained
a range of short and long papers dealing with image
processing, optical character recognition, and the
search, retrieval, and navigation of high resolution
document image collections (DH2014, 2014).


3 Other emerging techniques of interest to those aiming
to recover information from primary historical sources
include infrared or near-infrared imaging (Workman
and Weyer, 2007), and three-dimensional imaging
such as micro-CT (Baumann et al., 2008), 3D laser
scanning (Barnett et al., 2005), and Reflectance
Transformation Imaging (RTI) (Earl et al., 2010).

4 These are the most popular ways to capture multispectral
images in the heritage sector, although the cost of
equipment can still be prohibitive for many
institutions. At the time of writing, a set of narrowband
multispectral filters retails in the region of £10,000
and requires additional camera and lighting equipment;
this is the system we use in this experiment. A full
system for production and capture of specific light
wavelengths currently retails for £80,000. Camera
sensors that can select wavelengths automatically have
been developed (Balas et al., 2003), but these are not
commercially available. Relatively low-cost scanners
have been developed that claim full multispectral
capabilities, currently retailing for £2,000 (Information
in Images, 2014), but these claims have not been
verified by independent tests.

5 Although there are now over forty different guidelines
in existence which detail best practice in straightforward
digitisation of cultural and heritage materials
(Conway, 2008), none of them has described ideal
approaches for the capture, analysis, and storage of
multispectral images of heritage material.

6 Ratto, 2011; Ramsay and Rockwell, 2012; Schuman,
2014.

7 The creation of virtual models, or ‘phantoms’, to
allow this comparison is also explored in detail in
Giacometti (2013) and Giacometti et al. (2015).

8 www.lma.gov.uk

9 K-means clustering is a method to separate data
points into a number (k) of clusters according to
underlying shared characteristics. For example, an
image showing parchment and two different inks
might be separated into k = 3 clusters, so that the
pixels representing the three ‘layers’ are identified
separately. See Hartigan and Wong (1979), Gray and
Neuhoff (1998), and Lucchese and Mitra (2001).
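
The parchment-and-two-inks example in note 9 can be illustrated with a minimal sketch of k-means (Lloyd's algorithm) on single-band pixel values. This is not the implementation used in the experiment: the function name `kmeans_1d`, the way initial centres are chosen, and the reflectance values are all assumptions made for illustration. In a multispectral setting each pixel would be a vector of band values rather than one number.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster scalar pixel values into k groups (Lloyd's algorithm)."""
    # Assumed initialisation: spread the starting centres across the sorted values.
    ordered = sorted(values)
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centres[c] = sum(members) / len(members)
    return labels, centres


# Hypothetical reflectance values: bright parchment, a mid-tone ink, a dark ink.
pixels = [0.91, 0.88, 0.93, 0.52, 0.48, 0.55, 0.11, 0.09, 0.14]
labels, centres = kmeans_1d(pixels, k=3)
```

Pixels that share a label form one of the three 'layers'; the label numbers themselves carry no meaning beyond group membership.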

10 PCA is a technique for decomposing a set of data into
its intrinsic variability, preserving the maximum
variability of the data in fewer dimensions (Jolliffe, 2002).
In the ideal case, each of the principal components
would show one layer from the image.
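
As a hedged illustration of note 10, the sketch below finds only the first principal component of a small set of pixels, each measured in three bands. Power iteration on the band covariance matrix is swapped in here to keep the example dependency-free; it is not the routine used in the experiment, and the name `first_principal_component` and the pixel values are assumptions.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return the unit direction of greatest variance and each row's score on it."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance between bands.
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(d)]
           for a in range(d)]
    vec = [1.0] * d
    for _ in range(iterations):
        # Power iteration: repeated multiplication converges on the top eigenvector.
        nxt = [sum(cov[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Project every centred pixel onto the component.
    scores = [sum(r[j] * vec[j] for j in range(d)) for r in centred]
    return vec, scores


# Hypothetical pixels (three bands each): parchment, one ink, a darker ink.
pixels = [
    [0.90, 0.85, 0.80], [0.88, 0.84, 0.79],
    [0.50, 0.30, 0.70], [0.52, 0.31, 0.72],
    [0.10, 0.12, 0.60], [0.11, 0.10, 0.61],
]
component, scores = first_principal_component(pixels)
```

Displaying the scores as an image gives the first component; further components would be obtained by removing this direction from the data and repeating.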

11 ICA is designed to separate sources of signals from a
series of measurements (Hyvärinen et al., 2001).
Independent components are not ranked, and the
energy of each dimension is neither preserved nor
meaningful. It behaves similarly to PCA but can give
different results. Again, the aim is for each of the
independent components to show a different layer
from the image.

12 LSMA decomposes multispectral image data into
layers of materials by using a priori knowledge of the
spectral signals of materials that are present (Heinz
and Chang, 2001). This requires knowledge of the
absorption spectrum of each dye, which might not
always be available.
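
The idea in note 12 can be sketched, under simplifying assumptions, as an ordinary least-squares fit of one pixel spectrum to two known reference spectra. The sketch is unconstrained, whereas constrained forms of the method additionally require abundances to be non-negative and to sum to one. The function name `unmix_two` and both reference spectra are hypothetical.

```python
def unmix_two(pixel, end_a, end_b):
    """Least-squares abundances (a, b) such that pixel is close to a*end_a + b*end_b."""
    aa = sum(x * x for x in end_a)
    bb = sum(y * y for y in end_b)
    ab = sum(x * y for x, y in zip(end_a, end_b))
    pa = sum(p * x for p, x in zip(pixel, end_a))
    pb = sum(p * y for p, y in zip(pixel, end_b))
    # Solve the 2x2 normal equations; det is zero only for proportional spectra.
    det = aa * bb - ab * ab
    return (pa * bb - pb * ab) / det, (pb * aa - pa * ab) / det


# Hypothetical reference spectra over four bands.
parchment = [0.9, 0.8, 0.7, 0.6]
ink = [0.1, 0.2, 0.5, 0.8]

# A synthetic pixel that is 70% parchment and 30% ink.
mixed = [0.7 * p + 0.3 * i for p, i in zip(parchment, ink)]
a, b = unmix_two(mixed, parchment, ink)
```

Applying the fit to every pixel yields one abundance map per material, which is the 'layer' for that material.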

13 Further descriptions of these techniques and applications
are available in Chapter 2 of Giacometti (2013).

14 Non-linear or non-rigid transformations are those that
affect one area of an image in a different way to other
areas (Hill et al., 2001), thus allowing compensation
when one side of the parchment has shrunk, etc.

15 The amount of information in an image can be formally
calculated as the entropy of the image. A blank
image has no information and has entropy = 0,
whereas a completely random image carries maximum
information (in that the value of one pixel cannot be
predicted by that of its neighbours) and therefore has
maximum entropy. The joint entropy of two images
measures their combined information and increases as
the two images differ, because if the images
are different, one cannot be used to predict the other.
If the entropy H of an image X is H(X) and the joint
entropy of images X and Y is H(X,Y), then we can
define the mutual information I(X,Y) as the information
shared between two images, or equivalently their
similarity. Then, formally,
I(X,Y) = H(X) + H(Y) − H(X,Y).
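
The definition in note 15 can be checked numerically with a minimal sketch that estimates entropy from the frequency of pixel values. The function names and the tiny four-pixel 'images' are assumptions for illustration only; a registration routine would typically estimate these quantities from binned histograms of much larger images.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy, in bits, of a sequence of hashable values."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def mutual_information(x, y):
    """I(X,Y) = H(X) + H(Y) - H(X,Y) for two equally sized pixel sequences."""
    # The joint entropy is the entropy of the paired pixel values.
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


# Hypothetical four-pixel images with two grey levels.
image_a = [0, 0, 1, 1]
image_b = [0, 1, 0, 1]  # statistically unrelated to image_a
blank = [0, 0, 0, 0]

mi_same = mutual_information(image_a, image_a)
mi_unrelated = mutual_information(image_a, image_b)
```

An image compared with itself shares all of its information, whereas two unrelated images share none, so maximising I(X,Y) over candidate alignments favours the alignment at which one image best predicts the other.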

16 The procedure sometimes involved softening the
parchment using a mixture of cheese, milk, and
lime, before proceeding to scrape the writing using a
knife or razor (Diringer, 1953).

17 Full data are available in Appendix C of Giacometti
(2013).

18 The DOI for this dataset is 10.14324/000.ds.1469099.

19 The figures included in this paper were originally
published in Giacometti (2013).
