key: cord-0057702-jmvf5iba
authors: Gergatsoulis, Manolis; Papaioannou, Georgios; Kalogeros, Eleftherios; Carter, Robert
title: Representing Archeological Excavations Using the CIDOC CRM Based Conceptual Models
date: 2021-02-22
journal: Metadata and Semantic Research
DOI: 10.1007/978-3-030-71903-6_33
sha: 1068efc9cea7e59248742a02e9c553391a5e2831
doc_id: 57702
cord_uid: jmvf5iba

This paper uses CIDOC CRM and CRM-based models (CRMarchaeo, CRMsci) to represent archaeological excavation activities and the observations of archaeologists during their work in the excavation field. These observations are usually recorded in documents such as context sheets. As an application of our approach (case study), we used the records of the recent archaeological excavations in Fuwairit in Qatar, part of the Origins of Doha and Qatar Project. We explore issues related to the application of classes and properties as they appear in the latest versions of the aforementioned models, i.e. CIDOC CRM, CRMarchaeo, and CRMsci. The proposed data model could be used as the basis to create an automated system for archaeological documentation and archaeological data integration.

CIDOC CRM and CIDOC CRM based models such as CRMarchaeo have recently been used to model archaeological work. Archaeologists excavate, observe patterns, collect finds, keep notes, and produce records (such as handwritten excavation notebooks, filled-in context sheets, photographs, sketches, and drawings). CIDOC CRM and CRMarchaeo aim to aid their digital documentation. Can CIDOC CRM and CIDOC CRM based models sufficiently represent archaeological records? To what extent are they able to provide a framework to assist archaeological work, documentation and interpretation? We address these issues by working towards an automated CRM-based system to assist archaeologists in modeling excavation works and research.
Real-time digital documentation of data from excavations, integrated with other semantically described data, will help archaeologists to evaluate and interpret their work results more effectively. To this end, we have represented archaeological context sheets (the first page of the two-page context sheet, see Fig. 4) from recent archaeological excavation works at Fuwairit in Qatar (2016-2018), part of the Origins of Doha and Qatar Project (ODQ), by successfully employing classes and properties of the CIDOC CRM, CRMarchaeo and CRMsci models.

In the last decade, CIDOC CRM-related research and work has been done to integrate archaeological data, given the need to document archaeological science [20]. The ARIADNE project [17] and its continuation, the ARIADNEplus project, have systematically attempted to integrate different European archaeological datasets by using CIDOC CRM and by developing the CRMarchaeo and CRMsci extensions. Other attempts involved the extensions CRMsci and CRMdig to document scientific archaeological experiments and results [20], or just the CIDOC CRM (without any of its extensions) in an effort to describe archaeological objects, but without an evaluation of this approach [6]. English Heritage has also developed a CIDOC CRM extension, the so-called CRM-HE, to model archaeological concepts and their properties. To the same end, the STAR project (Semantic Technologies for Archaeology Resources) [2] investigated the use of this extension for archaeological data integration. Additionally, they proposed a semi-automatic tool for mapping archaeological datasets to CRM-HE [3], as well as an approach for creating archaeological data from grey literature through semantic search [23]. In terms of describing archaeological excavation records, there is an approach similar to the one presented in this paper [12]. This approach focused on CRMarchaeo classes and properties to model data derived from the daily archaeological excavation notebooks.
The data in the archaeological notebooks relate to describing the time-span of the works in an archaeological trench, defining and establishing elevation points, measuring the depths of archaeological strata, addressing the trench's stratigraphy, recording the archaeological findings from the works in the trench, and publishing the results of the excavation and the archaeological work.

This work lies within the overall theme of integrating various types of cultural metadata, encoded in different metadata schemas, using CIDOC CRM. Approaches relate to mapping the semantics of archival description expressed through the Encoded Archival Description (EAD) metadata schema to CIDOC CRM [4], semantic mappings of cultural heritage metadata expressed through the VRA Core 4.0 schema to CIDOC CRM [9, 10], and mapping of the semantics of Dublin Core (DC) metadata to CIDOC CRM [14]. These mappings consider the CIDOC CRM as the most appropriate conceptual model for interrelations and mappings between different heterogeneous sources [11] in the information science fields.

Archaeology is the study of past material remains, aiming to comprehend past human cultures. From fossils dating millions of years ago to last decade's fizzy drink cans, archaeologists try to discover evidence of past phenomena, cultures and societies. Archaeology lies within the humanities and social sciences, but it can also involve other scientific disciplines, depending on the nature of the discoveries [21]. Archaeological work is a process of continuous discovery and recording. Archaeological finds are preserved and stored for interpretation, study, and exhibitions. In terms of methodology, archaeologists work in:
1. recording visible remains of past human activity (i.e. buildings and ruins),
2. surveying the surface of an area to spot, report and collect artifacts (i.e. human-made objects, e.g. fragments of pottery, glass and metal objects) and ecofacts (i.e. natural remains deposited as a result of human activity, e.g. animal bones, seeds etc.), and
3. systematically excavating the ground to discover artifacts and ecofacts.

In archaeological excavations, archaeologists remove layers of soil (strata) within well-defined and oriented trenches. As soil is removed, distinct concentrations of soil and artifacts are revealed. These are called contexts and are reported in the diaries of the archaeologists or by filling in 'context sheets'. Archaeological diaries and/or context sheets form the basis of documenting the excavation process and comprise the starting point for archaeological analysis and interpretation.

The CIDOC Conceptual Reference Model (CIDOC CRM) is a formal ontology intended to facilitate the integration, mediation and interchange of heterogeneous cultural heritage information. CIDOC CRM intends to provide a model of the intellectual structure of cultural documentation in logical terms. Several extensions of CIDOC CRM suitable for documenting various kinds of cultural information and activities have been proposed so far. CRMarchaeo is an extension of CIDOC CRM created to support the archaeological excavation process and all the various entities and activities related to it, while CRMsci (Scientific Observation Model) is an extension of CIDOC CRM intended to be used as a global schema for integrating metadata about scientific observations, measurements and processed data in descriptive and empirical sciences such as biodiversity, geology, geography, archaeology, cultural heritage conservation and others in research IT environments and research data libraries.

This work applies CIDOC CRM, CRMarchaeo and CRMsci to document archaeological data and reports, which will offer valuable experience concerning the documentation needs of these data. We test our approach by using archaeological data from Qatar. This research will, in turn, influence the process of further developing and refining these models.
This work is based on CIDOC CRM version 6.2.7 (October 2019), CRMarchaeo version 1.5.0 (February 2020), and CRMsci version 1.2.8 (February 2020).

The Origins of Doha and Qatar Project (ODQ) started in 2012. It aims to investigate the history and archaeology of Doha, the capital of Qatar, and the other historic towns of Qatar, as well as the lives and experiences of their inhabitants. ODQ was run by University College London in Qatar (UCL Qatar) in collaboration with Qatar Museums (QM), funded by the Qatar Foundation through the Qatar National Research Fund (QNRF), under grants NPRP5-421-6-010 and NPRP8-1655-6-064. Given the rapid development of Doha in the last few decades, which transformed the city from a pearl fishing town at the beginning of the 20th century [5] to a vibrant modern capital city thanks to oil revenues since the 1950s [1, 7, 8], ODQ employed a multidisciplinary methodology. This included recording of historical buildings, excavations, recording oral histories of local people, GIS analysis for pre-oil and early oil Doha [16, 18, 19], archival research and study of historical documents on Doha's founding and growth. Preliminary results have been publicly presented in Qatar and around the world by the project leaders. The project has also produced educational material for schools in Qatar.

From 2016 until 2018, ODQ expanded its works to Fuwairit, about 90 km north of Doha in Qatar, with recordings of historical buildings, excavations and surface surveys, as the area consists of a historic village with buildings of historical architecture, as well as rock art and inscriptions, and the archaeological site itself (the remains of a pearl-fishing town of the 18th-early 20th c. AD). Works included mapping/surveying, excavations, recording of historical buildings, archaeological surface survey in both Fuwairit and the neighboring Zarqa, and pottery analysis [15].
For the purposes of this paper, we used context sheets from the archaeological excavation works in Fuwairit during the first season (2016), and specifically from Trench 1. In Fig. 1 we see the representation in CIDOC CRM of the overall structure of the Origins of Doha Project.

The layering of soil and debris (strata) on top of one another over time is called stratigraphy. Archaeologists and geologists are particularly interested in the stratigraphy of an area, as strata determine sequences of human-related or geological events. As a rule, when a stratum lies above another, the lower one was deposited first. Let's think of earth strata as layers in a chocolate cake. To make a cake, first we put the sponge base, then a chocolate cream layer, then another layer of sponge cake, then one more layer of chocolate cream, then the chocolate frosting, and last (but not least) a cherry on top. This is a sequence of cake-making events, with the base being the earliest and the cherry being the latest event in the process.

Archaeologists prefer to eat their cakes from top to bottom, from the cherry to the base! First, they define the contour of a specific space to excavate, which is usually a square or rectangular space of x metres by x metres. The excavation space is called an archaeological or excavation trench. Then they start to carefully and meticulously remove the top layer (stratum) of the trench, and they keep on excavating within this area stratum by stratum. The content of each stratum in the trench may include evidence of human activity, such as fragments of clay, glass and metal objects, roof tiles, bricks, fossils, remains of a fire, animal bones etc. These objects help towards dating the strata and interpreting the past events that formed the strata. When the excavation of a trench is finished, the sequence of excavated deposits and features can be arranged in a stratigraphic matrix according to their chronological relationship to each other, i.e.
whether the events that created them occurred before or after each other. This matrix is also known as a Harris Matrix, after the book on archaeological stratigraphy by E. C. Harris [13].

Usually, earth strata are not as straightforward as chocolate cake strata. Archaeological strata may contain formations such as post holes, pits, walls, and burrows, which disturb natural layers but often indicate human activity and are the results of human behaviour. Archaeologists number each stratum and each feature (e.g. a built structure or pit cut) on the stratigraphic matrix, and they try to interpret past events by correlating strata and the objects found within strata. Each stratum and each feature is called a context. It is important to note that contexts do not always have direct stratigraphic relationships with others, even if they are close to each other or likely to be contemporary. For example, two deposits which have built up on either side of a wall have no direct stratigraphic link, though they might both have a relationship with the same context below (e.g. the wall, which is stratigraphically below both deposits). In such cases the matrix branches. For every context, archaeologists fill in a context sheet, described below.

In CRMarchaeo, each context on the stratigraphic matrix is a member of the class A8 Stratigraphic Unit. Stratigraphic units are related via the property AP11 has physical relation, further refined by the property of property AP11.1 has type. The type can be 'above', 'below', 'within', 'next to' or other, depending on the relation of a context with another context in the stratigraphic matrix. In Fig. 2 we can see a fragment of the stratigraphic matrix of Trench 1, while in Fig. 3 we see the representation of a part of this stratigraphic matrix in CRMarchaeo.

Fig. 3. Representation of the fragment of the stratigraphic matrix appearing in Fig. 2.

Archaeologists working for the Origins of Doha Project, and therefore at Fuwairit, have used context sheets to record their excavation work in the archaeological trenches. The context sheet is the report describing each context unearthed and, therefore, it is critical for archaeological research and interpretation. Every context sheet offers:
- Reference information (site codes, trench and context numbers, relation to other contexts, date, names of archaeologists recording, related photo and drawing numbers).
- Information on the context's soil deposit and its characteristics.
- Information about finds in the context.
- Space for archaeological interpretation.
- Space for recording levels and an accompanying sketch (back sheet).

In Fig. 4 we see the front page of a context sheet from the excavation of Trench 1. A context sheet is an instance of the class E31 Document. A context sheet documents (P70 documents) an instance of the class A1 Excavation Process Unit (in our example this instance is Excavation of Stratigraphic Unit 2). The relation between the context sheet and the excavation of the stratigraphic unit that it documents is expressed through the path (see Fig. 5, in which the CRM representation of most of the fields of the context sheet appearing in Fig. 4 is depicted):
E31 Document → P70 documents → A1 Excavation Process Unit

Reference Information: The field Site Code contains a code (an instance of the class E42 Identifier) which identifies the project (ODQ in our case). This identifier is related to the Origins of Doha Project (an instance of the class E7 Activity, see Fig. 1) through a path of the form:
E7 Activity → P48 has preferred identifier → E42 Identifier
Concerning the values of the fields Trench and Context Number, we observe that the trench appears as an instance of the class A9 Excavation (see Fig. 1), while Context Number appears as an instance (in our example Excavation of Stratigraphic Unit 2) of A1 Excavation Process Unit (see Fig. 5).
These two instances should be related with the property P9 consists of through the path:
A9 Excavation → P9 consists of → A1 Excavation Process Unit
There are several fields of the context sheet which are represented in CRM by directly connecting the instance of A1 Excavation Process Unit with instances of other CRM classes through appropriate properties.

Concerning the items 1) Colour, 2) Compaction and 3) Composition, we observed that Colour and Compaction can be seen as properties of the material in Composition. These items are represented as follows: the value in Composition can be regarded as an instance (silty sand in our example) of the CRMsci class S11 Amount of Matter, which consists of (P45 consists of) an instance (sand in our example) of the class E57 Material. The values of the properties of this material are instances of the CIDOC CRM class E26 Physical Feature. Each feature is related to the material with the property P56 bears feature. An instance of the class E55 Type is also connected through the property P2 has type to each instance of E26 Physical Feature to denote the type of the feature (compaction or colour in our case).

The Deposit Type field of the context sheet takes one of the values listed on the right side of the context sheet under the title Deposit Type. In our model this is represented by creating an instance of the class E55 Type with the selected value (collapse in our example) and connecting this instance to the corresponding instance of A2 Stratigraphic Volume Unit through the path:
A2 Stratigraphic Volume Unit → P2 has type → E55 Type

This specific instance of A1 Excavation Process Unit (in our example Excavation of Stratigraphic Unit 2) was performed at a specific time, represented as an instance of E52 Time-span, and carried out by an instance of E21 Person.
This information is modeled in CRM by the following paths:
A1 Excavation Process Unit → P4 has time-span → E52 Time-span
A1 Excavation Process Unit → P14 carried out by → E21 Person
This information is recorded in the Initials and Date field of the context sheet.

Information on the Sequence of Context with Relation to Other Contexts: The information on the sequence of the context is depicted in the CIDOC CRM representation of the stratigraphic matrix (see Fig. 3). In this figure we see that the context S.U.2 (i.e. the context described by the context sheet of our example) is below the context S.U.1 and above the contexts S.U.3 and S.U.6. Notice that the instance S.U.2 of the class A8 Stratigraphic Unit coincides in both figures (Fig. 3 and Fig. 5).

The finds in the context can be represented as instances of the CRMsci class S10 Material Substantial. Each instance of this class is then related to the deposit of the stratigraphic unit in which it is contained through a path of the form:
A2 Stratigraphic Volume Unit → AP15 is or contains remains of → S10 Material Substantial

Reference Information (photographs, drawings, context volume): Each photograph taken or drawing made during the excavation process is an instance of the class E36 Visual Item. The photograph/drawing is related to the corresponding instance of the CRMarchaeo class A8 Stratigraphic Unit through the property P138 represents. To distinguish between photographs and drawings, we relate the corresponding instance of E36 Visual Item to an appropriate instance of E55 Type (i.e. an instance whose value is either photo or drawing).
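To make the paths above more concrete, the following is a minimal sketch of how some of them could be held as plain data and queried. It is not the authors' implementation: the `Statement` tuple, the `CLASS_OF` table, the helper functions and the instance labels (for example "Context sheet of S.U.2") are assumptions introduced only for illustration, and the property of property AP11.1 has type is simply stored as an extra field of the statement.

```python
from typing import List, NamedTuple, Optional


class Statement(NamedTuple):
    """One CRM statement; rel_type carries the AP11.1 'has type' value, if any."""
    subject: str
    prop: str
    obj: str
    rel_type: Optional[str] = None


AP11 = "AP11 has physical relation"

# Instance label -> CRM class. The labels are illustrative assumptions.
CLASS_OF = {
    "Origins of Doha Project": "E7 Activity",
    "ODQ": "E42 Identifier",
    "Trench 1": "A9 Excavation",
    "Context sheet of S.U.2": "E31 Document",
    "Excavation of Stratigraphic Unit 2": "A1 Excavation Process Unit",
    "S.U.1": "A8 Stratigraphic Unit",
    "S.U.2": "A8 Stratigraphic Unit",
    "S.U.3": "A8 Stratigraphic Unit",
    "S.U.6": "A8 Stratigraphic Unit",
}

STATEMENTS = [
    # Reference information of the context sheet
    Statement("Context sheet of S.U.2", "P70 documents",
              "Excavation of Stratigraphic Unit 2"),
    Statement("Origins of Doha Project", "P48 has preferred identifier", "ODQ"),
    Statement("Trench 1", "P9 consists of", "Excavation of Stratigraphic Unit 2"),
    # Fragment of the stratigraphic matrix: S.U.2 is below S.U.1
    # and above S.U.3 and S.U.6
    Statement("S.U.2", AP11, "S.U.1", "below"),
    Statement("S.U.2", AP11, "S.U.3", "above"),
    Statement("S.U.2", AP11, "S.U.6", "above"),
]


def related_units(unit: str, rel_type: str) -> List[str]:
    """Units that `unit` is related to by AP11 with the given AP11.1 type."""
    return [s.obj for s in STATEMENTS
            if s.subject == unit and s.prop == AP11 and s.rel_type == rel_type]


def path_is_instantiated(subject_class: str, prop: str, object_class: str) -> bool:
    """True if at least one statement instantiates the class-level path."""
    return any(
        s.prop == prop
        and CLASS_OF.get(s.subject) == subject_class
        and CLASS_OF.get(s.obj) == object_class
        for s in STATEMENTS
    )


if __name__ == "__main__":
    print(related_units("S.U.2", "above"))
    print(path_is_instantiated("A9 Excavation", "P9 consists of",
                               "A1 Excavation Process Unit"))
```

A real system would more likely store such statements in an RDF triple store. The flat list is used here only to keep the sketch self-contained.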
To represent the content of the context sheet field Description, Comments, Preliminary Interpretation, as well as the field Post Excavation Interpretation, we use a set of paths of the form:
A1 Excavation Process Unit → P140i was attributed by → S5 Inference Making → P2 has type → E55 Type
where a specific interpretation is encoded as an instance of S5 Inference Making, while the corresponding instance of the class E55 Type (which may take one of the values 'Description', 'Comment', 'Preliminary Interpretation', 'Post-excavation Interpretation:Local Stratigraphic Phase', 'Post-excavation Interpretation:Pot Phase') describes the type of this interpretation.

The field Context Same As relates the current context (an instance of A8 Stratigraphic Unit) with another context (i.e. another instance of A8 Stratigraphic Unit) which has the same features as the current context. This relation is expressed with the following path:
A8 Stratigraphic Unit → P130 shows features of → A8 Stratigraphic Unit
Such paths can be added in Fig. 3.

This work has used CIDOC CRM and its extensions CRMarchaeo and CRMsci to represent archaeological work and assist archaeologists in documenting and managing archaeological and cultural heritage information. It also adds to the theoretical discussion on common ground among humanities, computing, and information studies. We put emphasis on representing the contents of the first page of the two-page context sheets used by archaeologists in their systematic excavation works on archaeological trenches. As future work, we aim to extend the proposed model with the CRMba [22] classes and properties, to allow adding representations of architectural remains and their relations. Also, we will use CRMgeo to describe trench recordings of levels (the second page of the two-page context sheet) to complete the context sheet description. Another next step is to design an automated system for documenting excavation works.
This will provide archaeologists with the capacity to document their work in the field (archaeological contexts, findings, interpretation) in real time and make the most of the system's data entry and information searching facilities as well as explore the reasoning capabilities of the relevant ontologies.

1. Rediscovering the island: Doha's urbanity. From pearls to spectacle
2. Semantic technologies for archaeology resources: results from the STAR project
3. Semantic interoperability in archaeological datasets: data mapping and extraction via the CIDOC CRM
4. The semantic mapping of archival metadata to the CIDOC CRM ontology
5. Sea of Pearls: Seven Thousand Years of the Industry that Shaped the Gulf
6. CIDOC CRM-based modeling of archaeological catalogue data
7. Mapping the growth of an Arabian Gulf town: the case of Doha
8. Pearl towns and oil cities: migration and integration in the Arab coast of the Persian Gulf
9. Defining a semantic mapping of VRA Core 4.0 to the CIDOC conceptual reference model
10. Mapping cultural metadata schemas to CIDOC conceptual reference model
11. Query transformation in a CIDOC CRM based cultural metadata integration environment
12. Describing and revealing the semantics of excavation notebooks
13. Principles of Archaeological Stratigraphy
14. Integrating Dublin Core metadata for cultural heritage collections using ontologies
15. The Language of Ancient Pottery. An Analytical Study for the 18th-early 20th Century Pottery from the Site of Fuwairit, Qatar: Unpublished BA thesis
16. DOHA-Doha Online Historical Atlas
17. ARIADNE: a research infrastructure for archaeology
18. DOHA-Doha Online Historical Atlas
19. The Origins of Doha Project: online digital heritage remediation and public outreach in a vanishing pearling town in the Arabian Gulf
20. Documenting archaeological science with CIDOC CRM
21. Archaeology: Theories, Methods, and Practice, 6th edn
22. CRMba: a CRM extension for the documentation of standing buildings
23. Automatic metadata generation in an archaeological digital library: semantic annotation of grey literature