key: cord-0057699-n33v5ln5
authors: Marcondes, Carlos H.
title: Implementing Culturally Relevant Relationships Between Digital Cultural Heritage Objects
date: 2021-02-22
journal: Metadata and Semantic Research
DOI: 10.1007/978-3-030-71903-6_13
sha: 3c271e0530b3ac36eb0df55074d3613787037b8f
doc_id: 57699
cord_uid: n33v5ln5

A vocabulary of culturally relevant relationships (CRR) between cultural heritage objects in Libraries, Archives, and Museums (LAM) was proposed with the aim of interlinking digital collections using Linked Open Data (LOD) technologies. The CRR vocabulary is intended for culture curators, teachers, historians, and others, enabling them to interlink such digital resources, provide a richer context, and reveal new meanings of those resources. This paper tests and evaluates the CRR vocabulary as follows. Wikipedia articles about remarkable cultural heritage objects, such as the painting Mona Lisa and the book Dom Casmurro, are used to create RDF triples in which the heritage object is the subject, relationships of the CRR vocabulary are the predicates, and links found in each Wikipedia article to other LAM digital objects or other Web resources are the objects. The RDF graphs thus generated are presented and discussed. The need for improvements in the proposed CRR vocabulary is outlined, and specific changes are suggested.

Since the 2000s, cultural heritage collections of LAM have been published on the Web. Web catalog technology turns such collections into information silos even though, despite being held by different institutions, many of them are complementary. Many have thematic intersections or are related to other web resources such as authorities, historic events, subjects, exhibitions, or articles in Wikipedia and its sibling resources DBpedia and Wikidata. The emergence of Digital Humanities (Zeng 2019) poses new challenges to libraries, archives, and museums (LAM).
It requires that digital cultural heritage objects (HO) be machine-processable. Some of these collections are now being published as structured data using LOD technologies. As more cultural heritage datasets are published according to LOD technologies, the web of culture data grows [16]. Publishing LAM digital collections according to LOD technologies will achieve its full potential as the published resources become structured, interlinked, and queryable (Berners-Lee 2006). Aiming to explore the synergies between such collections, and between them and other web resources, a research project (Marcondes 2019) proposed a vocabulary of culturally relevant relationships (CRR) between HO with the aim of interlinking such collections using LOD technologies.

As sources and inspiration for its relationships, the CRR vocabulary used cases suggested by culture curators (Marcondes 2019, 125) and existing vocabularies such as [15] and ICA (2017) RIC-CM, among others (Marcondes 2019, 122). Many relationships in such vocabularies are similar in intended meaning to the CRR relationships that emerge from the curators' cases, which are the raw material for the development of the CRR vocabulary, but few of them have exactly the same meaning. A comprehensive description of the CRR vocabulary, including the meaning of each relationship, relationships with similar meaning in other vocabularies such as Dublin Core and ICOM/CIDOC (2014), and the namespace/URI specification, is given in the previously cited paper; it is not included here because of page limits. A table with the CRR vocabulary relationships is included in Sect. 3.

There are several LAM projects using LOD technologies (Marcondes 2019, 123), but few of them interlink collections hosted by different institutions. Related work highlights the value of initiatives to interlink LAM data and enrich metadata.
Such practice is increasingly recognized as one that adds value to LAM data [17], (Klein and Kyrios 2013), [2], (Zeng 2019), (McKenna, Debruyne and O'Sullivan 2020), [1]. The aim of the CRR vocabulary is to provide a tool to interlink and enrich LAM data. The proposed interlinking vocabulary is conceived as a tool to be used by culture curators in their work of contextualizing, commenting on, evaluating, and making sense of HO, and to improve the reuse of HO for educational and cultural purposes.

The CRR vocabulary now needs to be tested for its adequacy to interlink several HO and to interlink them with other web resources, forming complex conceptualizations of events, works, agents, and themes of cultural relevance. This phase of the research tests the proposed vocabulary on real cases of sets of HO whose mutual relationships are remarkable and recognized in the academic literature on culture. The cases reported in this paper are one of several planned test rounds. This paper addresses the following questions. Are the proposed CRR relationships complete and comprehensive enough to describe the different cases of culturally relevant relationships found or known in culture? Are the proposed relationships simple and intuitive enough to be used by culture curators without special training?

Testing the CRR vocabulary poses several difficulties. To our knowledge, there is no standard methodology for testing vocabularies. The vocabulary was conceived to be used by culture curators in their work of interlinking HO available throughout the web, and to be simple and intuitive enough to be used without any special training. Such aims guided its development. Even so, it is hard to recruit culture curators who understand the need for such a vocabulary and are aware of LOD technologies and their potential.
In the face of this difficulty, we opted to use as test cases Wikipedia articles about widely recognized examples of HO. Such Wikipedia articles are full of links to other related HO, thus constituting conceptualizations similar to those that culture curators may develop in real cases of interlinked HO. Such a test methodology seems objective and verifiable, and therefore adequate to this phase of the research.

Cases of sets of digital HO whose interrelationships are remarkable and recognized in the academic literature on culture were selected. The texts of the corresponding Wikipedia articles were used to identify other heritage objects linked to the article (simulating a CRR relationship), along with Agents, Concepts, Events/Processes, Time, and Places, which comprise the entities of the ontology proposed in this research project (Marcondes 2019, 133). Such links were manually extracted from the text of each Wikipedia article. Two cases of remarkable heritage objects were chosen: the painting Mona Lisa, https://en.wikipedia.org/wiki/Mona_Lisa, and the book Dom Casmurro, https://pt.wikipedia.org/wiki/Dom_Casmurro, by the Brazilian author Machado de Assis. Links within each article's text indicate a possible CRR relationship with such entities. The entities identified were then tentatively interlinked using the CRR vocabulary, forming a conceptualization.

Two tables were developed for each Wikipedia article: one with all links found within the article's introductory paragraph (the paragraph is "pasted" as it appears in the original Wikipedia article) and the other with selected links found within the remainder of the article (not exhaustive). The selected links are representative of typical links between the HO corresponding to the Wikipedia article and other HO such as books, paintings, and documents, or entities such as Agents, Concepts, Events/Processes, Time, and Places.
The results of each conceptualization are presented as sets of RDF (RDF Primer 2004) triples in a two-column table, where the table title represents the subject of the triples, the first column represents the predicate, and the second column represents the object of each triple. The links used are not necessarily URIs used in LOD, since they serve only to demonstrate the interlinking features of the CRR vocabulary. Within the second column, when an appropriate CRR relationship is not identified, a text from the Wikipedia article is quoted. Within the second column, when the triple object is a reference to a publication within the Wikipedia article, it is cited in the References with a note "(Wikipedia reference)". When there is no adequate CRR relationship to interlink the subject and the object of a triple, the corresponding first column of the table is filled with the observation "There is not a foreseen CRR relationship".

A table with all of the CRR vocabulary relationships extracted from Marcondes (2019) follows. When the vocabulary was conceived, the intention was to reuse relationships from other vocabularies. Most of the time this intention was not carried out, because the concepts in the original vocabulary had a slightly different meaning or were not relationships. This is the case of the CRR relationships 0021 Created_by/0022 Creator. They are somewhat similar to the Dublin Core element dc:creator, but dc:creator (see http://purl.org/dc/elements/1.1/creator) is not a relationship. In such cases, similar concepts are annotated within the CRR relationships (Table 1).

[3, 4], and has been described as "the best known, the most visited, the most written about, the most sung about, the most parodied work of art in the world" [5]. The painting's novel qualities include the subject's expression, which is frequently described as enigmatic [6], the monumentality of the composition, the subtle modeling of forms, and the atmospheric illusionism [7].
The painting is likely of the Italian noblewoman Lisa Gherardini [8], the wife of Francesco del Giocondo, and is in oil on a white Lombardy poplar panel. It had been believed to have been painted between 1503 and 1506; however, Leonardo may have continued working on it as late as 1517. Recent academic work suggests that it would not have been started before 1513 [9] [10] [11] [12]. It was acquired by King Francis I of France and [14] (equivalent to $650 million in 2018).

Other links found in the article to typical types of heritage objects (books, documents, museum objects) in LAM collections, or to entities such as Agents, Concepts, Events/Processes, Time, and Places, which comprise the entities of the proposed ontology (Marcondes 2019, 133), are also considered. Such links follow (Table 3).

Dom Casmurro is an 1899 novel written by Brazilian author Joaquim Maria Machado de Assis. Like The Posthumous Memoirs of Bras Cubas and Quincas Borba, both by Machado de Assis, it is widely regarded as a masterpiece of realist literature. It is written as a fictional memoir by a distrusting, jealous husband. The narrator, however, is not a reliable conveyor of the story as it is a dark comedy. Dom Casmurro is considered by critic Afranio Coutinho "a true Brazilian masterpiece, and maybe Brazil's greatest representative piece of writing" and "one of the best books ever written in the Portuguese language, if not the best one to date." The author is considered a master of Latin American literature with a unique style of realism (Jackson 1998).

Other links are found in the article, as follows. "Machado de Assis' life as a translator of Shakespeare, and also his influence from French realism, especially Honoré de Balzac, Gustave Flaubert and Émile Zola", "The Brazilian writer Dalton Trevisan once noted that Dom Casmurro is not to be read as the story of Capitu betraying Bentinho, but as a story of jealousy itself", https://en.wikipedia.org/wiki/Dalton_Trevisan. "A television miniseries titled Capitu, the feminine character of Dom Casmurro, was released in 2008", "MetaLibri Digital Library's Dom Casmurro".

The results are presented and discussed below, ordered by table and by line within each table.

Table 2, Line 8: a relationship for this case should be included in the CRR vocabulary, as such a relationship is more specific and implies the relationship expressed in Table 2, Line 7.

Table 4, Lines 2, 3: there is no foreseen CRR relationship between two works of the same author.

Table 4, Lines 4, 6, 8, 9, 10, 11, 12, 13, 14, 15: there is no foreseen CRR relationship similar to a thesaurus Associative Relationship.

Table 4, Lines 5, 7, 16: there is no foreseen CRR relationship between a heritage object and entities such as the artistic movements or periods with which it is associated. The same relationship appears in Table 2, Line 2. Adding such a relationship to the CRR vocabulary may be useful.

Table 5, Line 8: no CRR relationship similar to the thesaurus Associative Relationship was foreseen. As in Table 4, Lines 4, 6, 8, 9, 10, 11, 12, 13, 14, 15, adding an Associative Relationship might enable curators to interlink HO, and to interlink them with external entities, without adding more specific relationships that could make the CRR vocabulary hard for culture curators to use.

the bibliographic support to the claims made within the page, and External links, i.e., links to resources other than Wikipedia pages. Most of the links in each page are to other Wikipedia pages and documents, thus simulating a LOD environment and allowing the adequacy of the CRR vocabulary for assigning meaning to such links to be assessed. The result tables can also be easily converted to RDF N-Triples and loaded into triplestores, opening up a wide range of further tests.
The results of the test described above suggest the inclusion in the CRR vocabulary of the following relationships: Belongs to the collection of the cultural heritage institution, Link to the artistic movement or period, Link to a downloadable version, and a generic Associative Relationship.

Two more cases are planned for testing: the novel Don Quijote de La Mancha by the Spanish writer Miguel de Cervantes Saavedra, https://es.wikipedia.org/wiki/Don_Quijote_de_la_Mancha, and the panel painting Guerra e Paz by the Brazilian painter Candido Portinari, which decorates the United Nations headquarters in New York, https://pt.wikipedia.org/wiki/Guerra_e_Paz_(Candido_Portinari). A further test of the potential of the CRR vocabulary would be to convert the tables of each case into their constituent triples and load the corresponding datasets into a triple store. This will enable the conceptualization to be queried using SPARQL and the results to be evaluated according to how well they answer queries that retrieve relevant facts about each case. After such tests, we intend to release a new version of the CRR vocabulary.

Another open question in the development of the CRR vocabulary is how to implement a tool that enables culture curators to use the vocabulary easily, in a user-friendly way, to annotate and interlink cultural heritage collections and so create enhanced cultural resources made up of HO from collections in different institutions. The proposed CRR vocabulary may be useful to enable the construction of new, curated, and innovative resources such as virtual exhibitions and virtual classes, built on the basis of digital resources from different LAM collections. As a project requirement, the CRR vocabulary must be kept small and concise so that culture curators can use it intuitively and without any special training.
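The conversion and querying test proposed above can be illustrated with a minimal sketch, under stated assumptions. The namespace `http://example.org/crr#` is hypothetical (this paper does not give the CRR namespace URI), the use of `Created_by` as a local name is an assumption based on relationship 0021, the rows are illustrative and not copied from the result tables, and the helper functions `table_to_triples`, `to_ntriples`, and `match` are introduced here only for illustration. The `match` function is a plain single-pattern lookup standing in for a SPARQL query; it is not a SPARQL engine.

```python
# Hypothetical namespace: the paper does not state the CRR namespace URI.
CRR_NS = "http://example.org/crr#"


def table_to_triples(subject, rows):
    """Turn a two-column result table into (subject, predicate, object) triples.

    `subject` is the IRI given by the table title; each row is (predicate, object).
    Rows whose predicate is None stand for "There is not a foreseen CRR
    relationship" and are skipped, since there is no predicate to emit.
    """
    return [(subject, CRR_NS + pred, obj) for pred, obj in rows if pred is not None]


def to_ntriples(triples):
    """Serialize IRI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


def match(triples, s=None, p=None, o=None):
    """Single triple-pattern lookup; None plays the role of a SPARQL variable."""
    return [t for t in triples
            if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2])]


# Illustrative rows only; they are not the paper's tables.
MONA_LISA = "https://en.wikipedia.org/wiki/Mona_Lisa"
DOM_CASMURRO = "https://pt.wikipedia.org/wiki/Dom_Casmurro"
MACHADO = "https://pt.wikipedia.org/wiki/Machado_de_Assis"

triples = (
    table_to_triples(MONA_LISA, [
        ("Created_by", "https://en.wikipedia.org/wiki/Leonardo_da_Vinci"),
        (None, "https://en.wikipedia.org/wiki/Louvre"),  # illustrative unmapped row
    ])
    + table_to_triples(DOM_CASMURRO, [("Created_by", MACHADO)])
)

# N-Triples text that could be loaded into a triple store.
print(to_ntriples(triples))

# Roughly equivalent to: SELECT ?ho WHERE { ?ho crr:Created_by <MACHADO> }
works = [s for s, _, _ in match(triples, p=CRR_NS + "Created_by", o=MACHADO)]
print(works)
```

Rows without a foreseen relationship are dropped rather than given a placeholder predicate, so the serialized graph contains only statements that the vocabulary can currently express; the skipped rows are the ones that motivate the additions suggested above.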
As the number of LOD datasets available on the web grows, the value added by interlinking cultural heritage digital collections, as proposed by Berners-Lee (2006), will be highlighted. The success of such an enterprise depends on cooperation among heritage institutions to interlink their LOD collections.

The future of interlinked, interoperable and scalable metadata
Museum linked open data: ontologies, datasets, projects. Digital Present
Records in Context: a conceptual model for archival description
IFLA. International Federation of Library Associations and Institutions. Study Group on Functional Requirements for Bibliographic Records: Final Report. UBCIM Publications New Series. K.G. Saur
Madness in a Tropical Manner
VIAFbot and the integration of library data on Wikipedia
The Linked Open Data Cloud
Towards a vocabulary to implement culturally relevant relationships between digital collections in heritage institutions
NAISC: an authoritative linked data interlinking approach for the library domain
A Conceptual Model for Bibliographic Information
SPARQL Query Language for RDF. W3C
Silk - a link discovery framework for the web of data
Semantic enrichment for enhancing LAM data and supporting digital humanities

This work was carried out with the support of the Brazilian agencies CAPES (Financing Code 001) and CNPq (grant number 305253/2017-4).