Japanese Military “Comfort Women” Knowledge Graph: Linking Fragmented Digital Records
ARTICLE
Haram Park and Haklae Kim
INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2023
https://doi.org/10.6017/ital.v42i1.15799
Haram Park (haram9553@gmail.com) is a Master’s student in Library and Information Science at
Chung-Ang University. Haklae Kim (haklaekim@cau.ac.kr) is an Associate Professor of Library and
Information Science at Chung-Ang University. © 2023.
ABSTRACT
Materials related to Japanese military “comfort women” in Korea are managed by several
institutions. Each digital archive has its own metadata schema and management policies, and no
standard or common guideline for describing digital records has yet been formalized.
We propose a Japanese military “comfort women” knowledge graph to semantically interlink the
digital records from distributed digital archives. To build the knowledge graph, digital records
and descriptive metadata were collected from existing digital archives. A list of metadata
elements was defined by analyzing commonly used properties, and a knowledge model was designed
by reusing standard vocabularies. The knowledge graph was constructed by interlinking the
collected records with external data sources and enriching the data. The knowledge graph was
evaluated using the FAIR data maturity model.
INTRODUCTION
In December 1991, Kim Hak-Sun (a Korean) became the first woman to disclose and identify as a
former “comfort woman.”1 In February 1992, Ms. Itoh Hideko discovered three telegrams in the
Japanese Defense Agency stating that not only Korean but also Taiwanese women had been
dispatched as “comfort women.”2 Between 1931 and 1945, the Imperial Japanese Army forced
approximately 200,000 girls and young women from Korea, China, and other countries, known as
“comfort women,” into sexual slavery. These women came from all over East Asia, but the
majority, over 80 percent, were from South Korea.3 It was not until the early 1990s that survivors
began to share their stories and demand justice. Many international organizations and volunteers
continue to participate in advocacy and campaigns to resolve the issue of Japanese military
sexual slavery.4
However, the Japanese government has never accepted legal responsibility or agreed to pay
reparations.5
Regardless of political interpretation, we believe it is critical to reveal the historical truth. The
records of Japanese military “comfort women” serve as objective evidence that
the Japanese military engaged in sexual slavery. As there are now only 13 elderly survivors left in
South Korea, the records could serve as one of the key pieces of evidence for understanding the
Japanese military “comfort women.” In Korea, materials related to Japanese military “comfort
women” are managed by the National Archives of Korea and some private organizations, and some
of this material is being provided as digital archives.6
Digital archives systematically describe digital resources so that users can effectively search and
view the materials.7 In general, digital archives describe digital resources based on guidelines for
expressing standard metadata elements and data values that are mainly used in the domain. For
example, the US Library of Congress is creating digital resources with varying levels and types of
descriptive metadata, providing an increasingly coordinated and standardized approach to the
creation and management of descriptive metadata.8 However, for the digital archives related to
Japanese military “comfort women,” there are no recommendations or agreed guidelines on
metadata for describing digital records. Even when metadata standards such as Dublin Core are
used, there remain variations in describing metadata elements of digital records. Therefore,
linking or integrating the digital records with different metadata structures and values is difficult.
To solve this problem, a metadata model to describe digital records related to Japanese military
“comfort women” should be developed, and digital records should be systematically described. If
the various pieces of information contained in the digital record are expressed in a format that a
machine can understand, a precise search is possible based on the meaning and relationship of the
data. A knowledge graph can be applied to define the relationships between the various entities
included in Japanese military “comfort women” records. In particular, the records existing in a
distributed digital archive can be expressed as objects that can be identified on the web, so that
different records can be linked at a semantic level.9
This study proposes a method to interlink and search digital records of the digital archives of
Japanese military “comfort women.” For describing and linking distributed digital records, a set
of metadata elements was proposed, and a knowledge model was defined by examining the
common metadata model and the existing RDF vocabulary. The collected digital records were
constructed as a knowledge graph, using a knowledge model. The knowledge graph was evaluated
by applying the FAIR data maturity model.10 The remainder of this paper is organized as follows.
The literature review introduces the Japanese military “comfort women” issue and describes the
concepts and research trends related to knowledge graphs. We then introduce the case of Korean
digital archives containing materials about the Japanese military’s use of “comfort women.” Next,
we describe the process of developing the knowledge graph in detail, define SPARQL queries to
compare the search results of the existing digital archives with those of the knowledge graph,
and describe the differences in FAIR data maturity. Finally, the research results are summarized, and future
research directions are described.
LITERATURE REVIEW
Japanese Military “Comfort Women”
The Japanese military “comfort women” issue was made official in 1991, when the Korean Council
for the Women Drafted for Military Sexual Slavery by Japan and the Korean victims themselves
appealed for a resolution of the problem.11 The issue has since been raised through the
testimony of victims,12 the activities of individual researchers and civic groups,13 and appeals
to the international community and to domestic and international judicial procedures.14 Through
these efforts, the Japanese military “comfort women” issue has come to be seen as a problem of
forced mobilization, human trafficking, sexual exploitation, and extreme human rights violations
by the ruling state targeting women in the colonized state.15 However, the issue has also been a
cause of conflict and confrontation between victims and their families, private organizations, and
the South Korean and Japanese governments. For example, in a paper applying game theory, Mark
Ramseyer defined the Japanese military “comfort women” as prostitutes (ianfu) who engaged
in prostitution for the Japanese military for high wages during the Pacific War.16 This sparked a
debate about historical distortion.17 Some argue that the “comfort women” issue should be viewed
not as a conflict between Korea and Japan but as a women’s rights and universal human rights issue.18
From a political and social point of view, research on the Japanese military “comfort women” is
active, but insufficient research has been conducted on archives and records management due to
licensing of records, data sharing, and a lack of qualified personnel. Various licensing policies and
sharing limitations apply to the records kept by different institutions. As a result, the preservation
and exchange of documents are nominal, and they are administered with a minimal amount of
personnel. Records are essential evidence for discussing historical truths. Fifteen organizations,
from eight countries, have tried to list the records of the Japanese military “comfort women” as
UNESCO World’s documentary heritage.19 A total of 2,744 records have been requested, including
materials that prove the Japanese military’s “comfort women” system or materials produced by
“comfort women” victims. However, the decision to list Japanese military “comfort women” records
as UNESCO documentary heritage has been postponed due to tensions between South Korea and
Japan.20 The National Archives of Korea has selected materials related to the “comfort women” of
the Japanese military as a nation-designated record and is integrating and managing these
records.21 However, most records are scattered in various university research institutes,
nongovernmental organizations, and institutions, and it is difficult to systematically preserve and
manage them.
Reuse of Ontology Vocabularies and FAIR Data Principles
The records of the Japanese military “comfort women” are not systematically managed, and
existing digital archives tend not to capture sufficient information from the original
records. A previous study suggested a metadata schema for the integrated management of the
records of Japanese military “comfort women.”22 However, although most studies suggest common
metadata elements, they do not include methods for representing and processing records in a
machine-readable format.23 Reusing vocabularies is recommended to foster interoperability and
facilitate knowledge use by interlinking new datasets to existing resources. Some previous efforts
demonstrate a way of interlinking digital resources on the Web by using several ontology
vocabularies.24 In particular, Freire et al. propose a mapping from Schema.org metadata to the
Europeana Data Model. The proposed method is suitable for metadata aggregation in the area of
cultural heritage by enriching the semantics of the Schema.org model.25
The FAIR data principles are designed to reinforce the reusability of research data and are defined
as four principles: Findable, Accessible, Interoperable, and Reusable.26 In particular, the FAIR
principles emphasize the ability of machines to find and use data on their own, in accordance with
the research data management environment.27 Initially, the FAIR principles were recognized as a
tool to enhance the reusability of research data in the context of open science; however, they are
now being extended to a universal framework for preserving and managing data in the long
term.28
Representative frameworks for evaluating FAIR compliance include FAIR Metrics,29 the data
maturity model of the RDA (Research Data Alliance) working group,30 and FAIRsFAIR.31 FAIR Metrics
presents an evaluation framework that can measure FAIR indices using an automated tool.
Discussions on the FAIR principles are also expanding in digital archives and libraries.32 Koster
and Woutersen-Windhouwer propose FAIR principles suitable for LAM (libraries, archives, museums)
collections and suggest a practical method to increase the reusability of digital cultural heritage.33
DIGITAL ARCHIVES OF JAPANESE MILITARY “COMFORT WOMEN”
The records or documents of the Japanese military “comfort women” are managed in the form of
digital archives by national and private institutions. Table 1 summarizes the status of the
representative digital archives held by each institution. The Wednesday Demonstration Archive
is a digital archive operated by the Korean Council. It contains a record of the “regular demand
demonstration to solve the Japanese military’s sexual slavery problem” that began in January
1992. The archive contains 1,085 records, and each record is described with 17 metadata
elements. Archive 814, named for the annual Day of Remembrance of the Japanese Military “comfort
women” observed on August 14, aims to develop efforts and research results surrounding the
“comfort women” issue. Archive 814 has 596 records, including domestic and foreign legal records,
official documents, collections by subject, chronological tables, and book lists. The Seoul
Archives provides documents proving the existence of Japanese military “comfort women” and
comfort stations, drawn from documents produced by the Allied Forces during World War II. In
total, 137 records are provided, and each record consists of 25 descriptive metadata elements.
The Gender Archive provides documents on the issue of “Military Sexual Slavery by Japan” and “The
Women’s International War Crimes Tribunal on Japan’s Military Sexual Slavery.” A total of 408
records are provided, with 88 metadata elements describing each record. The National Archives of
Korea has designated records related to Japanese military “comfort women” as Nation-Designated
Archives No. 8. Among the records (approximately 3,060 cases) owned by House of Sharing
(http://www.nanum.org/eng/main/index.php) and Daegu Citizen Forum for Halmuni
(http://www.1945815.or.kr/), 27 records are selected as major records, and digitized records
including 20 metadata elements are provided.
Table 1. Status of records by archives
| Archives | Organization | Number of digital records | Number of descriptive metadata elements | URL |
|---|---|---|---|---|
| Wednesday Demonstration | The Korean Council | 1,085 | 17 | https://womenandwarmuseum.net |
| Archive 814 | Research Institute on Japanese Military Sexual Slavery | 596 | 20 | https://www.archive814.or.kr/ |
| Digital collection of “comfort women” | Seoul Metropolitan Archives | 137 | 25 | https://archives.seoul.go.kr/class/CC-0003 |
| Gender Archive | Seoul Foundation of Women and Family | 408 | 88 | http://genderarchive.or.kr/ |
| Nation-designated Archives No. 8 | National Archives of Korea | 27 | 20 | https://theme.archives.go.kr//next/nationalArchives/subPage/nationalArchives7.do |
Note: Archive names in the following sections are abbreviated for readability: WED: Wednesday
Demonstration; A814: Archive 814; SMA: Digital Collection of “Comfort Women”; GEN: Gender
Archive; NAK: Nation-designated Archives No. 8.
DEVELOPMENT OF JAPANESE MILITARY “COMFORT WOMEN” KNOWLEDGE GRAPH
Data Preprocessing
A total of 2,253 records and their metadata were collected from the five digital archives. After
excluding records with insufficient information (three documents from A814 and two from NAK),
2,248 records were constructed as a knowledge graph. Metadata values in the collected records
are not consistently expressed. For example, the Seoul Archives indicates the institution in the
form “[organization/group] Jinseong Jeong Research Team, Seoul National University, 2015,”
whereas “Kunji Takei, Governor of Yamagata Prefecture” in Archive 814 combines a person, an
organization, and a position in a single value. These values are separated into relevant
categories and described in the corresponding metadata elements (e.g., “Kunji Takei, Governor of
Yamagata Prefecture” is divided into “Kunji Takei” (name) and “Yamagata Prefecture” (his
position)). The units for expressing metadata values such as “production date” and “language” are
also unified, and errors in some data values are corrected directly (e.g., “Gabrelle Kirk McDonald”
is changed to “Gabrielle Kirk McDonald”, restoring the “i” to her first name). In addition, a new
classification system is defined by aligning and integrating existing categories, since the digital
archives use different categories (e.g., Book/Publication, Document).
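The preprocessing steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the splitting rule, the accepted date notations, and the function names (split_creator, normalize_date, correct_name) are assumptions made for illustration, and only the sample values are taken from the text.

```python
import re

# Hypothetical correction table; its single entry is the misspelling cited in the text.
NAME_CORRECTIONS = {"Gabrelle Kirk McDonald": "Gabrielle Kirk McDonald"}


def split_creator(value: str) -> dict:
    """Split a combined 'Name, Position of Organization' string into fields.

    Assumed rule: the name precedes the first comma, and the remainder is
    split at the first ' of ' into a position and an organization.
    """
    name, _, rest = value.partition(",")
    position, _, organization = rest.strip().partition(" of ")
    return {
        "name": name.strip(),
        "position": position or None,
        "organization": organization or None,
    }


def normalize_date(value: str) -> str:
    """Rewrite values such as '1992', '1993/08', or '1992.1.8' in ISO 8601 form.

    Only the precision present in the input is kept; no month or day is invented.
    """
    parts = re.findall(r"\d+", value)
    if not parts or len(parts[0]) != 4:
        raise ValueError(f"unrecognized date value: {value!r}")
    return "-".join([parts[0]] + [p.zfill(2) for p in parts[1:3]])


def correct_name(value: str) -> str:
    """Replace a known misspelling; leave every other value unchanged."""
    return NAME_CORRECTIONS.get(value, value)


print(split_creator("Kunji Takei, Governor of Yamagata Prefecture"))
print(normalize_date("1992.1.8"))
print(correct_name("Gabrelle Kirk McDonald"))
```

A rule this simple would mis-split organization names that themselves contain " of ", so real data would still need manual review.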
A Model of Designing a Knowledge Graph
Two tasks are performed to transform the collected data into a knowledge graph. First, since the
metadata elements used in the digital archives differ, the metadata properties commonly used
across archives are extracted. Second, for the common metadata, the scope of reuse is determined
by investigating existing RDF vocabularies, and the reused terms are added to the proposed
knowledge model.
Common metadata elements among the selected archives are defined by the following two
criteria:
1. Metadata elements commonly used in all archives were extracted. Metadata elements present
in all five archives, such as Title, Description, Identifier, License, and URL are mandatory.
Metadata elements defined in two or more archives, such as “production date” and
“language,” are optional properties. Even if the metadata name written in Korean is
different, it is regarded as the same metadata element if its purpose is to indicate the same
data value.
2. Metadata elements not used in the actual data were excluded from the model. For example,
GEN has 88 metadata elements, but 60 of these elements have no data values.
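The two criteria can be read as a simple classification rule. The Python sketch below is one possible reading, not the authors' procedure: the function name classify_element and the sample usage data are hypothetical, and the text does not say how an element found in only one archive is treated, so that case is labeled "unspecified".

```python
ARCHIVES = frozenset({"WED", "A814", "SMA", "GEN", "NAK"})


def classify_element(archives_with_values: set) -> str:
    """Classify one metadata element by the archives that hold values for it."""
    used_in = len(ARCHIVES & archives_with_values)
    if used_in == 0:
        return "excluded"      # criterion 2: no data values in any archive
    if used_in == len(ARCHIVES):
        return "mandatory"     # criterion 1: present in all five archives
    if used_in >= 2:
        return "optional"      # criterion 1: defined in two or more archives
    return "unspecified"       # single-archive case is not covered by the text


# Illustrative input only; this is not the survey data behind Table 2.
usage = {
    "title": {"WED", "A814", "SMA", "GEN", "NAK"},
    "language": {"SMA", "GEN"},
    "unused element": set(),
}
classification = {name: classify_element(found) for name, found in usage.items()}
print(classification)
```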
Table 2 summarizes a list of metadata elements for describing the records of digital archives.
A proposed model should be able to represent the context of individual records and their own
properties. After investigating semantic relationships between common metadata elements and
existing vocabularies, the proposed model is defined. The model reuses existing vocabularies, such
as DCMI (Dublin Core Metadata Initiative) metadata terms for describing online resources, SKOS
(Simple Knowledge Organization System) for representing taxonomies, RiC-O (Records in Contexts –
Ontology) for describing digital records, and Schema.org for supporting universal search on the
Web. The basic structure of the Japanese military “comfort women” knowledge model is illustrated
in figure 1. All records that are digital resources (“#Record”) are instances of
schema:ArchiveComponent and represent records provided by each archive. The individual
records contain information on several people and organizations. For example, the schema:creator
property describes the creator of a record, the schema:contributor property represents a person
who contributed a record, and the schema:mentions property represents a thing related to a
record. An archive manager who holds or maintains a record is described using
the schema:holdingArchive property, and the archive manager is represented by the
schema:ArchiveOrganization class. If the value of a property is a type of organization, then the
value of rdfs:range is the schema:Organization class.
Figure 1. Abstract structure of the Japanese Military “comfort women” knowledge graph.
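To make the model concrete, the following Python sketch builds the triples for a single record and prints them in N-Triples syntax. It is a minimal illustration under stated assumptions: the http://example.org/ namespace, the URI layout, and the helper names (Literal, record_triples, to_ntriples) are hypothetical, and schema:name is used for the title as in the example query of table 4. The record identifier, title, and creator are taken from the A814 example discussed later in the text.

```python
from dataclasses import dataclass

SCHEMA = "http://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/"  # hypothetical namespace, not the published one


@dataclass(frozen=True)
class Literal:
    """Marks a plain string value so it is not serialized as a URI."""
    value: str


def record_triples(record_id, title, creator_id, archive_id):
    """Describe one record with the classes and properties named in the model."""
    record = BASE + "record/" + record_id
    creator = BASE + "person/" + creator_id
    archive = BASE + "archive/" + archive_id
    return [
        (record, RDF_TYPE, SCHEMA + "ArchiveComponent"),
        (record, SCHEMA + "name", Literal(title)),
        (record, SCHEMA + "creator", creator),
        (record, SCHEMA + "holdingArchive", archive),
        (creator, RDF_TYPE, SCHEMA + "Person"),
        (archive, RDF_TYPE, SCHEMA + "ArchiveOrganization"),
    ]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples as N-Triples lines."""
    def term(node):
        if isinstance(node, Literal):
            escaped = (node.value.replace("\\", "\\\\")
                       .replace('"', '\\"')
                       .replace("\n", "\\n"))
            return f'"{escaped}"'
        return f"<{node}>"
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


triples = record_triples(
    "A814-107",
    "Letter from Jan Ruff-O’Herne in support of US Congress Resolution 121 in 2007",
    "jan-ruff-oherne",
    "A814",
)
print(to_ntriples(triples))
```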
Table 2. Mapping results of both metadata elements and models of the knowledge graph
| Archive element names (WED, A814, SMA, NAK, GEN) | Property | Entity | Value | Mandatory |
|---|---|---|---|---|
| title; title; title; title; dc:title | schema:title | schema:ArchiveComponent | xsd:string | Yes |
| identifier; registration number; Identification number; Management number; dc:identifier | schema:Identifier | schema:ArchiveComponent | xsd:string | Yes |
| description; scope and content; description; dc:description | schema:description | schema:ArchiveComponent | xsd:string | Yes |
| production date; production date; year of production; itm:date | schema:dateCreated | schema:ArchiveComponent | xsd:dateTime | No |
| creator; creator; production institution; itm:creator | schema:creator | schema:ArchiveComponent | schema:Person; schema:Organization | Yes |
| license; license; rights statement; license | cc:license | schema:ArchiveComponent | cc:License | Yes |
| management organization; management organization; service provider; management organization | schema:holdingArchive | schema:ArchiveComponent | schema:ArchiveOrganization | Yes |
| URL; URL; URL; URL | schema:sameAs | schema:ArchiveComponent | schema:URL | Yes |
| attachment view; attachment view; attachment view; attachment view; File | schema:mainEntityOfPage | schema:ArchiveComponent | schema:URL | No |
| attachment download; download | schema:downloadUrl | schema:ArchiveComponent | schema:URL | No |
| record type; record type; record type; record type; itm:typeOfRecord | rico:hasContentOfType | schema:ArchiveComponent | skos:Concept | Yes |
| format; type of document; itm:formatOfRecord | rico:hasDocumentaryFormType | schema:ArchiveComponent | skos:Concept | No |
| number of pages; number of pages; itm:size/amount | schema:numberOfPages | schema:ArchiveComponent | xsd:nonNegativeInteger | No |
| language; itm:langauage | schema:inLanguage | schema:ArchiveComponent | schema:Language | No |
| periodic classification; temporal coverage | schema:temporalCoverage | schema:ArchiveComponent | xsd:string | No |
| related terms; Related information; itm:relatedPerson; itm:relatedOrganization; itm:relatedEvent | schema:mentions | schema:ArchiveComponent | schema:Person; schema:Organization; schema:Event | No |
| donor/collector; contributor, collector/provider; itm:donor | schema:contributor | schema:ArchiveComponent | schema:Person; schema:Organization | No |
Note: Element names in the first column are separated by semicolons and follow the column order
WED, A814, SMA, NAK, GEN; archives without a corresponding element are omitted.
Data Enrichment and Transformation
Data enrichment refers to the process of appending or otherwise enhancing the collected data
with the relevant context obtained from additional sources. In the collected digital records, the
entities of person and organization are linked to Wikidata (http://wikidata.org) and the enriched
information is expanded to a knowledge graph using the RDF extension of OpenRefine
(http://openrefine.org).
A total of 654 terms were extracted from the existing archives for people and organizations. After
removing duplicates, the dictionary contained 150 people and 312 organizations. For each term in
the dictionary, a matching entity is searched for in Wikidata. If the entity name matches
completely, the URI of Wikidata is assigned automatically. Thirty-eight percent of people (57) and
28 percent of organizations (88) matched between the dictionary and Wikidata. Matched entities
can be added to the knowledge graph by extracting the properties and values of Wikidata. For
example, Kim Bok-dong is linked to Wikidata (Q16175111), and citizenship, occupation, place of
birth, and gender, which did not exist in the collected data, are added to the knowledge graph. As a
result, six properties are fetched from the person and three attributes are obtained from the
organization. A total of nine properties were expanded by data enrichment, and vocabularies for
representing the extended properties were mapped (e.g., citizenship is mapped to
schema:nationality).
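The exact-match linking step and the reported match rates can be sketched as follows. This is an illustration under assumptions rather than the OpenRefine workflow the authors used: the label index is a hypothetical stand-in for a Wikidata lookup, and the function names are invented. Only the Kim Bok-dong identifier and the counts (57 of 150 people, 88 of 312 organizations) come from the text.

```python
WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"


def reconcile_exact(terms, labels_to_qid):
    """Return {term: Wikidata URI} for terms whose name matches a label exactly."""
    return {
        term: WIKIDATA_ENTITY + labels_to_qid[term]
        for term in terms
        if term in labels_to_qid
    }


def match_rate(matched: int, total: int) -> int:
    """Share of dictionary terms that were linked, as a whole percentage."""
    return round(100 * matched / total)


# Hypothetical label index; only the Kim Bok-dong identifier is from the text.
labels = {"Kim Bok-dong": "Q16175111"}
links = reconcile_exact(["Kim Bok-dong", "Unlisted Person"], labels)
print(links)

print(match_rate(57, 150))   # people: 38
print(match_rate(88, 312))   # organizations: 28
```

Because only complete name matches are linked automatically, spelling variants remain unmatched, which is consistent with the modest match rates reported above.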
The constructed knowledge graph has 47,499 triples for 3,069 entities. The collected records and
the information contained in them account for 2,560 entities. A total of 145 entities (57 people
and 88 organizations) were linked to Wikidata and added to the knowledge graph. The enriched
entities contain 2,114 explicit statements and 102 inferred statements. As shown in
table 3, the total number of triples is 47,327 for explicit statements and 172 for inferred
statements. The knowledge graph is published on GitHub
(https://github.com/hike-lab/comfort-women-archives).
Table 3. Statistics of the constructed knowledge graph
| | Entities | Explicit statements | Implicit statements | Sum of statements |
|---|---|---|---|---|
| Collected entities | 2,560 | 45,213 | 70 | 45,283 |
| Enriched entities | 509 | 2,114 | 102 | 2,216 |
| Sum | 3,069 | 47,327 | 172 | 47,499 |
Figure 2 shows the information about “Jan Ruff O’Herne” in the knowledge graph. She is a
Dutch-Australian survivor of sexual enslavement by the Japanese military and has been active as
a human rights activist since she disclosed in 1992 that she had been sexually enslaved by the
Japanese army. The
knowledge graph links several records produced or contributed by O’Herne. WED’s record
(wednes-demo-368) links Jan Ruff O’Herne with related information (schema:mentions), and
A814’s record (A814-107) links Jan Ruff O’Herne as the record’s creator (schema:creator).
Existing digital archives do not provide specific information about the person, organization, or
event described in metadata. Without knowing that Jan Ruff O’Herne was a victim of the
Japanese military “comfort women” system, it is difficult to fully understand the record “Letter
from Jan Ruff-O’Herne in support of US Congress Resolution 121 in 2007” provided by A814.
However, as shown in figure 2, the knowledge graph provides a rich context for understanding her
and her associated records.
Figure 2. Semantic relationships of Jan Ruff O’Herne on the knowledge graph.
EVALUATION
The constructed knowledge graph was evaluated in two ways: 1) the discoverability of records in
the five archives and in the knowledge graph was compared using several semantic queries, and
2) the FAIR data evaluation was applied to the knowledge graph and the existing digital archives.
Discoverability
All queries aim to find the digital records across the five digital archives that satisfy given
search conditions and are written in SPARQL, the standard RDF query language. Table 4 shows an
example query (Q3), which retrieves the records produced from 1990 to 1994 and sorts them in
ascending order of production date. The type of every matched object must match the rdf:type
exactly, and regardless of its physical location, each object is identified by its URI and
included in the search result.
Table 4. A SPARQL query example (Q3)
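PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <http://schema.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?title ?date ?ArchiveOrganizationName
WHERE {
  # Every record is typed as schema:ArchiveComponent and held by one archive
  ?record rdf:type schema:ArchiveComponent ;
          schema:name ?title ;
          schema:dateCreated ?date ;
          schema:holdingArchive ?ArchiveOrganization .
  ?ArchiveOrganization rdfs:label ?ArchiveOrganizationName .
  # Keep only records produced from 1990 through 1994
  FILTER (?date >= "1990-01-01"^^xsd:date && ?date <= "1994-12-31"^^xsd:date)
}
ORDER BY ?date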
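For readers less familiar with SPARQL, the selection logic of Q3 can be mirrored in a few lines of Python. This is only a sketch of the query's semantics over an in-memory list: the record dictionaries and the function name records_created_between are hypothetical, and no triple store is involved.

```python
from datetime import date


def records_created_between(records, start, end):
    """Keep records dated within [start, end] and sort them by date, ascending.

    Records without a production date are dropped, just as the query's required
    schema:dateCreated pattern excludes them.
    """
    hits = [
        r for r in records
        if r.get("dateCreated") is not None and start <= r["dateCreated"] <= end
    ]
    return sorted(hits, key=lambda r: r["dateCreated"])


# Hypothetical records; titles and dates are placeholders, not archive data.
records = [
    {"title": "Record C", "dateCreated": date(1995, 3, 1), "holdingArchive": "GEN"},
    {"title": "Record B", "dateCreated": date(1994, 12, 31), "holdingArchive": "A814"},
    {"title": "Record A", "dateCreated": date(1992, 1, 8), "holdingArchive": "WED"},
    {"title": "Record D", "dateCreated": None, "holdingArchive": "SMA"},
]

q3 = records_created_between(records, date(1990, 1, 1), date(1994, 12, 31))
for r in q3:
    print(r["dateCreated"].isoformat(), r["title"], r["holdingArchive"])
```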
Table 5. List of SPARQL queries
| Query | Description | Number of results |
|---|---|---|
| Q1 | Select all records of Japanese military “comfort women” | 2,248 |
| Q2 | Select all records whose record type is ‘Document’ | 1,793 |
| Q3 | Select records produced between 1990 and 1994, and sort in ascending order | 345 |
| Q4 | Select all information about ‘Ministry of Gender Equality and Family’ | 480 |
| Q5 | Select all information about ‘Jan Ruff-O’Herne’ | 120 |
Table 5 summarizes the queries constructed to search the knowledge graph, and figure 3 shows
the results of the comparison between searching the existing archives and querying the
knowledge graph. The existing archives provide keyword-based search without considering the
meaning and relationship of search keywords. Furthermore, they do not share any common
categories or classifications with one another. A knowledge graph that semantically links records
in different digital archives, by contrast, enables accurate and relevant discovery. Q1, Q2, and
Q3 find all digital records matching the query condition and information semantically linked to
those records. For example, GEN has 169 records produced between 1990 and 1994. Because the
archive does not support searching by record type, it is not possible to search specifically by
record type in Q2 and Q3. In the knowledge graph, however, the record type is described with
rico:hasContentOfType; because this information is expressed at the semantic level, the 169
related records can be retrieved. Q4 and Q5 discover entities based on their semantic relations.
The “Ministry of Gender Equality and Family” in Q4 is an organization, and each government uses
a slightly different name for the department (e.g., “Ministry of Gender Equality”). Q5 discovers
an entity that is described differently across the existing archives. The knowledge graph
semantically defines the variants of entities and their types. As a result, the knowledge graph
returned 104 more search results for Q4 and nine more for Q5 than the existing archives.
Figure 3. Search results of knowledge graph and existing digital archives.
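The gain reported for Q4 comes from resolving name variants to a single entity before searching. The Python sketch below contrasts a plain keyword match with an entity-based match. It is an illustration under assumptions: the organization URI, the variant table, and the three sample records are hypothetical, and only the two ministry names come from the text.

```python
# Hypothetical identifier standing in for the organization's URI in the graph.
MINISTRY = "http://example.org/organization/ministry-of-gender-equality-and-family"

# Name variants mapped to one entity; both names appear in the text.
VARIANTS = {
    "ministry of gender equality and family": MINISTRY,
    "ministry of gender equality": MINISTRY,
}


def resolve(name):
    """Map a name to its entity URI, ignoring case and extra whitespace."""
    return VARIANTS.get(" ".join(name.lower().split()))


def keyword_search(records, keyword):
    """Plain string matching, as offered by the existing archives."""
    return [r for r in records if keyword.lower() in r["mentions"].lower()]


def entity_search(records, name):
    """Match records whose mentioned name resolves to the same entity."""
    target = resolve(name)
    if target is None:
        return []
    return [r for r in records if resolve(r["mentions"]) == target]


# Hypothetical records for illustration only.
records = [
    {"id": "r1", "mentions": "Ministry of Gender Equality and Family"},
    {"id": "r2", "mentions": "Ministry of Gender Equality"},
    {"id": "r3", "mentions": "Seoul Metropolitan Archives"},
]

query = "Ministry of Gender Equality and Family"
print([r["id"] for r in keyword_search(records, query)])
print([r["id"] for r in entity_search(records, query)])
```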
FAIR Data Evaluation for the Knowledge Graph
The FAIR data evaluation for the constructed knowledge graph reveals a clear improvement
compared to the existing archives. The knowledge graph follows the FAIR data principles for
Findable, Accessible, and Interoperable. All objects of the constructed knowledge graph can be
identified by URI, and metadata elements are described with a standard vocabulary, so that
machines can search for digital resources. Digital resources in the existing archives are
accessible over the Web; therefore, the archives received a relatively high score for
Accessible. However, access to the metadata of individual records was restricted, as the
majority of metadata elements were described as simple strings instead of machine-readable
forms. The knowledge graph improves accessibility by providing URIs for metadata elements. In
addition, to avoid being confined to the resources of the existing archives, standardized
vocabularies, such as Schema.org and Dublin Core, were applied to increase the connectivity
between data, and rich contextual information was provided through semantic linkage with
Wikidata. As shown in figure 4, the evaluation score of Reusable is 0.7, which is 2.9 times
higher than that of the existing archives. The metadata elements in the
knowledge graph clearly describe a license for reuse. In particular, the Creative Commons License
and the Korea Open Government License provide machine-readable URI information to enhance
reusability. However, data for which licensing information is not clear or not provided are left
blank.
In summary, the constructed knowledge graph semantically connects digital resources fragmented
in different archives, enables a rich search, and satisfies all FAIR data indicators.
Figure 4. Results of FAIR data evaluation of the knowledge graph and existing digital archives.
CONCLUSION
This study proposed a method for linking and searching digital records from the Japanese military
“comfort women” digital archives. In Korea, materials related to Japanese military “comfort
women” are managed by several institutions, some of which are provided as digital archive
services. However, the existing digital archives describe digital records without common standards
or guidelines, and the metadata of individual records are expressed in text format in HTML
documents without explicitly expressing their structure and meaning. Therefore, digital records
that exist in different digital archives cannot be connected even if they have the same context, such
as subject, event, person, or institution. This study proposed a common metadata model for the
descriptive metadata of digital records and constructed a knowledge graph in which digital
records are semantically interlinked. Furthermore, the FAIR data maturity model was used to
evaluate the constructed knowledge graph. The constructed knowledge graph semantically
defines the relationship between the various entities included in the records of Japanese military
“comfort women.” In particular, records existing in a distributed digital archive are expressed as
objects that can be identified on the Web, so that different records can be explored at a semantic
level. The knowledge model proposed herein is the first attempt to describe digital records related
to Japanese military “comfort women”; thus, it can serve as a starting point for discussing a
comprehensive model for describing fragmented digital records worldwide. We also release all the
collected records and the constructed knowledge graph under an open license to support further
collaboration.
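The idea of interlinking can be illustrated with a minimal sketch. It assumes that each record is identified by a URI and linked to the entity it describes through a single property; the example.org URIs, the choice of `schema:about` as the linking property, and the function names are all hypothetical and are not taken from the constructed knowledge graph.

```python
from collections import defaultdict

# All URIs below are hypothetical placeholders, not real archive identifiers.
PERSON = "http://example.org/person/p001"
EVENT = "http://example.org/event/e001"

# (subject, predicate, object) triples; "schema:about" is assumed here as the
# property that links a record to the entity it describes.
TRIPLES = [
    ("http://example.org/archiveA/record/1", "schema:about", PERSON),
    ("http://example.org/archiveB/record/7", "schema:about", PERSON),
    ("http://example.org/archiveB/record/8", "schema:about", EVENT),
]


def records_by_entity(triples, predicate="schema:about"):
    """Index record URIs by the entity they are about."""
    index = defaultdict(set)
    for subject, pred, obj in triples:
        if pred == predicate:
            index[obj].add(subject)
    return index


def cross_archive_links(index):
    """Keep entities described by records from more than one archive."""
    links = {}
    for entity, records in index.items():
        # The archive is taken to be the URI prefix before "/record/".
        archives = {uri.split("/record/")[0] for uri in records}
        if len(archives) > 1:
            links[entity] = sorted(records)
    return links


links = cross_archive_links(records_by_entity(TRIPLES))
for entity, records in links.items():
    print(entity, records)
```

In this sketch, only the person entity is reported, because it is the only entity described by records from two different archives. A production system would more plausibly store the triples as RDF and query them with SPARQL.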
However, several issues must be considered when constructing and managing high-quality
digital records. First, the records must contain accurate and rich semantic information. The
collected digital archives have an average of 16 metadata elements, but because the metadata
elements and values differ among institutions, the data accuracy needs to be improved. Second, it is
necessary to clearly state the conditions for the use of records. Most records do not provide a clear
license or terms of use. It is important to explicitly express and provide international or Korean
standard licenses for digital resources. Finally, it is necessary to discuss making the records of
Japanese military “comfort women” available as open data. Opening the records facilitates both
their sharing and the exchange of information between domestic and international scholars, and it
can also play a significant role in their long-term preservation. As a majority of records are
fragmented and difficult to discover and manage, it is necessary to find an effective method to
preserve the records by opening and sharing them and to lead research cooperation at home and
abroad.
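The second consideration, explicit licensing, lends itself to a simple automated check. The following is a minimal sketch, assuming that each record is available as a dictionary with an optional `license` field; the record identifiers, the field name, and the function name are hypothetical.

```python
# Hypothetical record descriptions; identifiers and values are illustrative only.
RECORDS = [
    {"id": "rec-1", "license": "https://creativecommons.org/licenses/by/4.0/"},
    {"id": "rec-2", "license": ""},
    {"id": "rec-3"},
]


def records_without_license(records):
    """Return the ids of records whose license field is absent or blank."""
    return [
        record["id"]
        for record in records
        if not (record.get("license") or "").strip()
    ]


missing = records_without_license(RECORDS)
print(missing)
```

Such a check only detects that a license statement is missing; deciding which license applies to a given record remains the responsibility of the holding institution.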
ENDNOTES
1 Chunghee Sarah Soh, “The Korean ‘Comfort Women’: Movement for Redress,” Asian Survey 36,
no. 12 (1996): 1226–40, https://doi.org/10.2307/2645577.
2 Shogo Suzuki, “The Competition to Attain Justice for Past Wrongs: The ‘Comfort Women’ Issue in
Taiwan,” Pacific Affairs 84, no. 2 (June 2011): 223–44, https://doi.org/10.5509/2011842223.
3 Center for Korean Legal Studies, “Military Sexual Slavery, 1931–1945,” accessed October 17,
2022, https://kls.law.columbia.edu/content/military-sexual-slavery-1931-1945.
4 Kathryn J. Witt, “Comfort Women: The 1946–1948 Tokyo War Crimes Trials and Historical
Blindness,” The Great Lakes Journal of Undergraduate History 4, no. 1 (September 2016): 17–34.
5 “South Korea: Lawsuits against Japanese Government Last Chance for Justice for ‘Comfort
Women’,” Amnesty International, accessed October 17, 2022,
https://www.amnesty.org/en/latest/news/2020/08/south-korea-lawsuits-against-the-japanese-government-last-chance-for-justice-for-comfort-women/.
6 SinCheol Lee and Hye-in Han, “Comfort Women: A Focus on Recent Findings from Korea and
China,” Asian Journal of Women’s Studies 21, no. 1 (March 2015): 40–64,
https://doi.org/10.1080/12259276.2015.1029229.
7 Itza A. Carbajal and Michelle Caswell, “Critical Digital Archives: A Review from Archival Studies,”
The American Historical Review 126, no. 3 (September 2021): 1102–20,
https://doi.org/10.1093/ahr/rhab359.
8 “Library of Congress Metadata for Digital Content – Master Data Element List Version 4.1,”
Library of Congress, accessed October 4, 2022,
https://www.loc.gov/standards/mdc/elements/MasterDataElementList-20120215.doc.
9 Stefano Ferilli and Domenico Redavid, “An Ontology and Knowledge Graph Infrastructure for
Digital Library Knowledge Representation,” Italian Research Conference on Digital Libraries,
(January 2020): 47–61, https://doi.org/10.1007/978-3-030-39905-4_6.
10 Mark D. Wilkinson et al., “Evaluating FAIR Maturity through a Scalable, Automated, Community-
Governed Framework,” Scientific Data 6, no. 174 (September 2019): 1–12,
https://doi.org/10.1038/s41597-019-0184-5.
11 Na-Young Lee, “The Korean Women’s Movement of Japanese Military ‘Comfort Women’:
Navigating between Nationalism and Feminism,” The Review of Korean Studies 17, no. 1 (June
2014): 71–92.
12 Jaeyeon Lee, “The Ethno-Nationalist Solidarity and (Dis)comfort in the Wednesday
Demonstration in South Korea,” Gender, Place & Culture (2021): 1–14,
https://doi.org/10.1080/0966369X.2021.2016655.
13 Lee and Han, “Comfort Women,” 40–64.
14 Witt, “Comfort Women,” 17–34.
15 Na-Young Lee, “The Korean Women’s Movement,” 71–92.
16 J. Mark Ramseyer, “Contracting for Sex in the Pacific War,” International Review of Law and
Economics 65, (March 2021): 105971, https://doi.org/10.1016/j.irle.2020.105971.
17 Andrew Gordon and Carter Eckert, “Statement by Andrew Gordon and Carter Eckert Concerning
J. Mark Ramseyer, ‘Contracting for Sex in the Pacific War’,” accessed October 4, 2022,
https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37366904.
18 Jaeyeon Lee, “The Ethno-Nationalist Solidarity and (Dis)comfort,” 1–14.
19 Heisoo Shin, “Voices of the ‘Comfort Women’: The Power Politics Surrounding the UNESCO
Documentary Heritage,” The Asia–Pacific Journal 19, no. 5 (March 2021): 1–19.
20 Ian E. Wilson, “The UNESCO Memory of the World Program: Promise Postponed,” Archivaria 87,
(May 2019): 106–37.
21 Yunshin Hong, “Epilogue: ‘Comfort Stations’ as Sites of Remembrance,” in “Comfort Stations” as
Remembered by Okinawans during World War II, ed. Robert Ricketts (Leiden: Brill, 2020), 432–59.
22 Ji Hyeon Bong and Young Joon Nam, “A Study on the Design of Metadata Elements for
Management of Oral History Archives about Sexual Slavery by Japan’s Military,” Journal of
Korean Society of Archives and Records Management 19, no. 1 (February 2019): 225–50,
https://doi.org/10.14404/JKSARM.2019.19.1.225.
23 Haram Park and Haklae Kim, “A Knowledge Graph on Japanese ‘Comfort Women’: Interlinking
Fragmented Digital Archival Resources,” Journal of Korean Society of Archives and Records
Management 21, no. 3 (August 2021): 61–78,
https://doi.org/10.14404/JKSARM.2021.21.3.061.
24 Myung-Ja K. Han et al., “Exposing Library Holdings Metadata in RDF Using Schema.Org
Semantics,” International Conference on Dublin Core and Metadata Applications, (September
2015): 41–49, https://dcpapers.dublincore.org/pubs/article/view/3772.
25 Nuno Freire, Valentine Charles, and Antoine Isaac, “Evaluation of Schema.org for Aggregation of
Cultural Heritage Metadata,” Semantic Web (June 2018): 225–39,
https://doi.org/10.1007/978-3-319-93417-4_15.
26 Mark D. Wilkinson et al., “The FAIR Guiding Principles for Scientific Data Management and
Stewardship,” Scientific Data 3, no. 160018 (March 2016): 1–9,
https://doi.org/10.1038/sdata.2016.18.
27 “FAIRification Process,” GO FAIR, accessed October 4, 2022,
https://www.go-fair.org/fair-principles/fairification-process/.
28 Christian Haux and Petra Knaup, “Using FAIR Metadata for Secondary Use of Administrative
Claims Data,” Studies in Health Technology and Informatics 264 (August 2019): 1472–73,
https://doi.org/10.3233/SHTI190490.
29 Wilkinson et al., “Evaluating FAIR Maturity,” 1–12.
30 Christophe Bahim et al., “The FAIR Data Maturity Model: An Approach to Harmonise FAIR
Assessments,” Data Science Journal 19, no. 1 (October 2020): 41,
https://doi.org/10.5334/dsj-2020-041.
31 Ansuriya Devaraju et al., “FAIRsFAIR Data Object Assessment Metrics (v0.4),” FAIRsFAIR
(October 2020), https://doi.org/10.5281/zenodo.4081213.
32 Silvia Calamai and Francesca Frontini, “FAIR Data Principles and Their Application to Speech
and Oral Archives,” Journal of New Music Research 47, no. 4 (May 2018): 339–54,
https://doi.org/10.1080/09298215.2018.1473449; Gustavo Candela et al., “Reusing Digital
Collections from GLAM Institutions,” Journal of Information Science 48, no. 2 (August 2020):
251–67, https://doi.org/10.1177/0165551520950246; Danuta Nitecki and Adi Alter, “Leading
FAIR Adoption across the Institution: A Collaboration between an Academic Library and a
Technology Provider,” Data Science Journal 20, no. 1 (February 2021): 6,
https://doi.org/10.5334/dsj-2021-006.
33 Lukas Koster and Saskia Woutersen-Windhouwer, “FAIR Principles for Library, Archive and
Museum Collections: A Proposal for Standards for Reusable Collections,” Code4Lib Journal 40
(May 2018).