key: cord-0888363-ftq38i37
authors: González-Eras, Alexandra; Santos, Ricardo Dos; Aguilar, Jose; Lopez, Alberto
title: Ontological engineering for the definition of a COVID-19 pandemic ontology
date: 2021-12-17
journal: Inform Med Unlocked
DOI: 10.1016/j.imu.2021.100816
sha: 1fdf7f61b98d33e99856389c9f13664f66cf9778
doc_id: 888363
cord_uid: ftq38i37

COVID-19 has generated a large amount of information in different formats, one of which is the ontology format. There are also earlier ontologies from other disciplines that can help to analyze the COVID-19 pandemic. Given the large quantity of COVID-19 information available as ontologies, approaches to ontology integration and interoperability could be beneficial. In this context, this research proposes a new ontology, called the COVID-19 Pandemic ontology, which is the product of an ontological engineering process, also proposed in this research, that integrates several ontologies to cover all aspects of this infectious disease. The ontological engineering process defines fusion, alignment, and linking tasks for integrating the ontologies. The resulting pandemic ontology provides a simple repository for storing information about COVID-19 that reuses existing ontologies and offers multiple views of the disease, including its social context. This ontology has been tested in different case studies to demonstrate its ability to infer useful information about the COVID-19 pandemic.

The year 2020 started with a pandemic due to the outbreak of coronavirus (COVID-19 henceforth). This virus has spread quickly across the world, with significant effects on society: deaths, economic catastrophes, etc. The COVID-19 pandemic has also prompted a great deal of research in different domains to help health institutions fight the virus. Currently, a large amount of information about this disease is produced at an impressive speed and grows exponentially every day, without clear criteria to order, comprehend, use and connect it. [...] between them, and according to their domains, the integration can be a merge/mixture or a linking between them. When the ontologies are in the same knowledge domain, they must be merged/fused; when they are complementary, they must be linked [16, 36, 37].

In this way, we define a new ontology from different ontologies, each of which addresses an aspect that is particularly useful for the treatment of COVID-19. This ontology is related to the pandemic outbreak and covers the most remarkable aspects needed to understand how and why this virus is propagating. This research also provides information about the critical aspects related to the disease. It considers basic classes associated with COVID-19, such as "Treatment", "Causes", "Symptoms", "Transmission Mechanism", and "Epidemiology", as well as classes from other domains, such as "Sociocultural", "Socioeconomics", and "Demography". Thus, our ontology includes ontologies about infectious disease (e.g., IDO) integrated with contextual information (e.g., the "Sociocultural" class), enabling a more complete and deeper reasoning process that exploits all the knowledge gathered through these ontologies. We have conducted several experiments with our ontology to answer crucial questions about the pandemic. In general, the main contributions of this research are:
- An ontological engineering process to build ontologies about COVID-19.
- An ontology, called Tepuy-COVID, which mixes ontologies from the COVID-19 domain.
- An ontology, called COVID-19 Pandemic, which models the knowledge about COVID-19 from different dimensions (symptoms, treatments, socio-cultural aspects).
- Case studies that analyze the behavior of the pandemic from the developed ontology.

This research is divided into the following sections: Section 2 presents the related research, with the main ontologies linked to our work. Section 3 defines the main concepts in the domain of Ontological Engineering. Section 4 describes the procedure to build our COVID-19 Pandemic ontology based on our ontological engineering process. Section 5 presents some experiments related to our ontology through several case studies. Finally, the conclusions arising from this research are presented.

This section describes several studies that rely on ontological engineering and ontology integration techniques for different purposes, as well as COVID-19 related ontologies. It analyzes the related works from the following perspectives: 1. COVID-19 ontologies and ontologies of other domains used to represent the information of the pandemic; 2. techniques and methods of ontological engineering used to model a knowledge domain; and 3. metrics and validation schemes used in the ontological domain.

Several COVID-19 ontologies follow the FAIR principles, which state that all research data should be findable, accessible, interoperable and reusable [24], and offer robust support for data integration, sharing, reproducibility, and computer-assisted data analysis [22]. These include the CIDO ontology, which brings together various models to represent aspects such as similarity to other viruses, common symptoms, and drugs that have been tried against the virus. The COVID-19 Surveillance Ontology supports surveillance in primary care. DRUGS4COVID19 defines the relationship between medications and COVID-19 symptoms. The COVIDCRFRAPID ontology provides semantic references of quiz questions and answers [25]. The CODO ontology defines patients, clinical tests, travel history, available resources, current needs, trend studies, and growth projections; it describes real cases of the pandemic [22]. However, none of the ontologies above has been integrated with ontologies from other domains to represent the connotations of the disease in different contexts [23].

Regarding techniques and methods, the authors of [1] reported a strategy for reusing existing ontologies, many of them related to our approach and applied in this publication. They defined a High-Level collaborative Architecture (HLA) to specify the semantics of objects and their interactions from existing ontologies. They then built the interoperation of the ontologies on an automatic transformation method embedded in HLA. The output was checked with a consistency verification method, which supports the feasibility of the proposed ontology management strategy. Multi-dimensional collaborative ontology models were presented in [2, 4]; they integrate a series of sub-ontologies through processes such as mapping and merging of their concepts, properties, and instances. The authors of [2] proposed two ideas: the core ontology and the stage ontology. The stage ontology describes the different ontologies to be integrated; the core ontology is built by integrating the ontologies in the stage ontology using techniques such as mapping and merging. In [4], an ontology based on a federated collaboration mechanism was defined, which involves an ontology fusion strategy and a weighted approach to leverage the integration of the ontologies. In [3, 6], the authors defined an ontology integration process that applies different engineering methods in the following order: first, an ontology mapping to establish relationships between terms of other ontologies; second, an ontology alignment that looks for connections between different ontologies; and finally, an ontology merging that produces new ontological models based on the previous steps. Notably, in [3], HLA was extended to integrate ontologies using different engineering methods, such as mapping, merging, and aligning.
On the other hand, the authors of [6] defined an approach to find the semantic relations in a set of ontologies. They then proposed an automatic semantic retrieval process to visualize the model that describes the ontological integration. The resulting integration is an automatic semantic expansion that end users can use in their queries. The research in [5] proposed an approach to facilitate semantic integration by using different ontological techniques to solve two problems: the reuse of ontologies and their heterogeneity. The authors defined an architecture with four layers: the presentation layer describes the meta-information of the ontologies (language, domain, etc.); the terminology layer defines synonyms, polysemy, etc.; the concepts layer shows the structure of the ontologies (classes); and the semantic layer defines the semantic relationships, properties, etc.

Ontological Engineering refers to the activities linked to the ontology development process, including the methodologies, tools, and languages required for building ontologies [16, 33, 34, 36, 37] . One of the current main domains is Ontological Mining, which consists of extracting behavior patterns, knowledge, and other characteristics, using mining techniques to build or enrich ontologies [16] . Thus, ontological mining is the discovery of new ontology knowledge, from its concepts and relationships, including their structures, to the instances related to each concept. In a context with a high number of ontologies, ontological mining is necessary to extract global knowledge from a set of ontologies. According to [16] , the main ontological mining techniques are Ontology alignment, Ontology linking, Ontology fusing/mixing, which are described below.

The ontology alignment consists of making the comparison (matching) between the concepts of the ontologies analyzed, which is the process of finding relationships or correspondences between entities of different ontologies [16] . For that reason, ontology alignment performs semantic correspondence analysis between two or more ontologies. The technique compares ontology concepts, obtaining the relationships between entities of different ontologies. This relationship can be between other classes, individuals, properties, or formulas.

The objective of ontology alignment is to find relationships between the entities expressed in different ontologies and to discover equivalences, which are determined through similarity measures between these entities. The process starts with a mapping between the ontology classes, applying similarity measures between them. The comparison is based on the calculation of similarity measures, which can be linguistic (names of entities), between properties (classes), or between graphs (taxonomic structure), among others.

Ontology linking is also called mapping. Its objective is to establish identity relationships between entities of different ontologies through their common characteristics (e.g., superclass_of and subclass_of properties). The mapping result is a link ontology that contains the equivalent entities and properties of the ontologies, through which the two ontologies are connected.

Thus, it allows the creation of a general ontology through the integration of different linked ontologies. The identification and definition of the concepts that link the ontologies require a certain consensus. Alignment techniques allow finding the set of relationships and properties potentially used for the link between the ontologies. It may also need an expert to create new concepts that are not considered in any ontology to be linked (see Figure 1 ). 

Ontology merging is also called fusing or mixing. It is a process in which several ontologies within the same knowledge domain come together to standardize knowledge, make knowledge grow, or obtain locally complete knowledge [36, 37]. Mixing is required when ontologies handle the same domain but with different or partial representations, so that they coincide in certain concepts and not in others. In that case, it is necessary to integrate them (see Figure 2).

Classically, mixing ontologies means obtaining a new ontology while considering aspects such as inconsistencies (a relationship that contradicts another relationship within the same ontology), synonyms, contradictions, and discrepancies between the ontologies. There are two types of merges: a weak merge, in which some ontology concepts may be left unmixed, and a strong merge, which is done in two parts: first the weak merge is carried out, and then the concepts and relationships that were left out are added. Some of the principles used during the merging process are the following [16, 36, 37]:

• When a concept of one ontology matches one of the concepts in the other ontology, this concept is included in the new ontology, which grows in size and is enriched.
• When a concept in one ontology is the same as a concept in the other and only the name differs, the concepts are treated as synonyms in the integration process.

Table 1 presents the description of the ontology construction process. The Ontological Engineering Process first defines the Tepuy-COVID ontology, which brings together ontologies of the COVID domain. This ontology is created by mixing pairs of domain ontologies, based on the alignment between them (see step 1). Ontologies of other domains are then linked to this COVID domain ontology (Tepuy-COVID ontology). To do so, the following procedure is repeated between the Tepuy-COVID ontology and each ontology of the other domains: first, the alignment between them, and then, the link between the ontologies (see step 2). Finally, in each step competency questions are used to validate the quality of the resulting ontology (Tepuy-COVID ontology and COVID-19 Pandemic ontology). Additionally, an essential part of the process is selecting the tools that carry out the ontological engineering process. Although various applications perform alignment [7, 8, 9], linking [10, 11], and mixing [10, 11], none efficiently supports all three processes [12]. So, they must be manually integrated to support an entire ontological engineering process. In this research, we have chosen the Alignment API application (Align) for the alignment and mapping tasks [7] and Protégé for the mixing of the domain ontologies [10]. We develop an ontological engineering process integrating the partial results of these tools using ontological languages like OWL and RDF ( 3 . With the first two domain ontologies, we will carry out alignment and fusion processes.

The resulting ontology will be linked with the ontology of the hospital management domain, thus generating our Pandemic ontology. To validate the results, we will use competency questions and some metrics such as Precision, Recall, and F1 measure [7] , which will give us a vision of the quality of the obtained ontology.

The process begins with the entry of the CODO and COVIDCRFRAPID ontologies in the Alignment API tool [19] . For this, we selected the similarity measure to find the similar names (terms) of classes, properties, and instances. The tool offers a set of lexical and semantic similarity methods, typical of text disambiguation context, which analyze terms according to their linguistic structure (comparing terms according to the characters that make them up) [16] and their semantic domain (comparing terms against a dictionary or thesaurus) [15, 16] . In this way, the result represents the relationship strength between pairs of ontology elements assigning a value between zero and one (where one means a maximum similarity and zero that there is no similarity).

According to [17] , several similarity methods validate the alignments between different language units, as is the case of pairs of terms, which are also applicable to the context of ontology alignment. We get similar pairs of elements from ontologies using the Alignment API tool and two approaches of similarity measures: terminological matching and Linguistic-based similarity. Terminological matching considers that "same concepts are likely to be modeled using quite similar names" [27] , and it uses string-based techniques [29] to concepts comparison, similar as the case of Levenshtein and SMOA (String Metric for Ontology Alignment) measures. Linguisticbased similarity [29, 15] focuses on semantic domain terms, using distance methods and lexical databases, such as WordNet, obtaining a "lexical semantic relatedness measure that represents the strength of the semantic relationship between terms according to the shortest path between nodes in the semantic network" [18] .

The measures used were: 1. EditDistNameAlignment: it uses the Levenshtein distance between entity names to look for a similarity between pairs of terms [7, 14, 33]; 2. SMOANameAlignment: it considers two features, the commonalities and the differences between terms [26, 27]; 3. JWNLAlignment: it computes a substring distance between the entity names of the first ontology and the entity names of the second ontology expanded with WordNet 3.0 synsets [30]. Additionally, it uses the WordNet thesaurus. Table 3 presents the alignments of the CODO and COVIDCRFRAPID ontologies using the Levenshtein Distance method [13] with a threshold of 0.8 [19, 20]. The pairs of elements that have a relation strength equal to 1 correspond to exact matches in the names of terms in the two ontologies. However, as we can see in the "Type" column, they do not necessarily belong to the same kind of ontological element (class, property, or instance). The pair [codo#VitalSigns, whocovid19crfsemdatamodel#Vital_signs] has a relation strength of 0.9, which places it above the established threshold (0.8). On the other hand, for the remaining pairs, which do not exceed the threshold, we observe that although the terms share a specific group of characters, their semantic domains do not seem to be related.
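To make the thresholding step concrete, the following is a minimal sketch of a Levenshtein-based name similarity normalized to the interval [0, 1]. It is not the Alignment API implementation: the function names are hypothetical, and both the lower-casing of names and the normalization by the longer name are assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1 means identical names.

    Assumption: names are lower-cased and the distance is divided by the
    length of the longer name. The real tool may normalize differently.
    """
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


THRESHOLD = 0.8  # threshold reported in the text


def is_candidate(a: str, b: str, threshold: float = THRESHOLD) -> bool:
    """A pair is kept as a candidate alignment when it exceeds the threshold."""
    return name_similarity(a, b) > threshold


print(round(name_similarity("VitalSigns", "Vital_signs"), 2))
print(is_candidate("VitalSigns", "Vital_signs"))
```

Under these assumptions the pair VitalSigns / Vital_signs differs by a single inserted character, so it stays above the 0.8 threshold, in line with the behavior described for Table 3.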

The merging process follows the steps indicated in [10]. The CODO and COVIDCRFRAPID ontologies are analyzed using the alignment obtained with the Alignment API tool. For that, the alignments are transformed into OWL axioms using the Alignment API rendering method (specifically, the OWLAxiomsRendererVisitor) [19], which generates a document that includes only the alignments found (which we will call the Alignment Ontology). Then, we use Protégé's merge option to fuse the two COVID-19 ontologies and the Alignment Ontology and generate the Tepuy-COVID Ontology (see Figure 4). The result is a new ontology that contains the elements of the two ontologies and includes the alignments between them. Figure 5 shows some of the detected alignments that have been included in the mix, such as the classes "VitalSigns" (CODO) and "Vital_Signs" (COVIDCRFRAPID), which have an "equivalent classes" relationship because their similarity exceeds the threshold of 0.8. The new ontology also includes equivalence relations with similarities below the threshold, as in the case of "LaboratoryTestFinding" (CODO) and "Laboratory_question" (COVIDCRFRAPID), with a similarity measure of 0.57, which is included because the expert considers that both classes belong to the same knowledge domain. As a result, we obtain the Tepuy-COVID Ontology (see Figure 6) using Protégé. Furthermore, we validate the model using the Pellet reasoner (which comes with Protégé by default), analyzing the inconsistencies that can occur during the merging process. Pellet performs the following tests: class hierarchy, object property hierarchy, data property hierarchy, class assertions, object property assertions, and same individuals [28]. If no errors are found in any test, the ontological model is considered consistent, and it is possible to infer knowledge, as can be seen in Figure 6 (see red circles) for the case of Figure 5.
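The idea of an Alignment Ontology that contains only equivalence axioms can be illustrated with a small sketch. This is not the output of OWLAxiomsRendererVisitor: the function name, the `example.org` namespaces and the Turtle serialization are assumptions chosen for illustration, and only class equivalences are handled (properties would need `owl:equivalentProperty`).

```python
OWL_PREFIX = "@prefix owl: <http://www.w3.org/2002/07/owl#> ."

# Placeholder namespaces; the real ontology IRIs are not given in the text.
CODO = "http://example.org/codo#"
CRF = "http://example.org/covidcrfrapid#"


def render_alignment_ontology(cells, threshold=0.8, expert_approved=frozenset()):
    """Render (iri1, iri2, strength) cells as owl:equivalentClass triples.

    A cell is kept when its strength exceeds the threshold, or when an
    expert has explicitly approved the pair despite a lower strength.
    """
    lines = [OWL_PREFIX, ""]
    for left, right, strength in cells:
        if strength > threshold or (left, right) in expert_approved:
            lines.append(f"<{left}> owl:equivalentClass <{right}> .")
    return "\n".join(lines)


cells = [
    (CODO + "VitalSigns", CRF + "Vital_signs", 0.9),
    (CODO + "LaboratoryTestFinding", CRF + "Laboratory_question", 0.57),
]
# Pair accepted by the expert although its strength is below the threshold.
approved = {(CODO + "LaboratoryTestFinding", CRF + "Laboratory_question")}

turtle = render_alignment_ontology(cells, expert_approved=approved)
print(turtle)
```

A file of this kind could then be loaded next to the two source ontologies before merging, so that the merged model carries the equivalences explicitly.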

Once we have obtained the Tepuy-COVID Ontology, we can extend the model to other domains through the Presence Ontology (PREO). The process begins with the alignment of the Tepuy-COVID Ontology with the PREO ontology. Table 4 shows the alignment result using the Levenshtein Distance method (EditDistNameAlignment); that is, we use the same metric for the alignment phase as in the merging case. The similarity of a pair must exceed the threshold (0.8). For example, pairs of properties such as "has_value" (Tepuy-COVID Ontology) and "hasValue" (PREO) represent nearby domains. The "Type" column shows the type of alignment found (class or property). Then, using the Alignment API tool, we transform the alignment result into a format that links the ontologies involved with their respective alignments. The tool offers several methods for this process. We use OWLAxiomsRendererVisitor, since this method delivers a linking ontology that invokes the ontologies involved and the alignments between them. In addition, its output format is compatible with Protégé, which allows displaying and using the generated ontology [19]. Figure 7 shows the alignment between the Tepuy-COVID Ontology and the PREO ontology and the generated linked ontology (COVID-19 Pandemic ontology). In the alignment ontology, we can observe how certain classes shared by the two origin ontologies complement their subclasses and properties in the resulting ontology. For example, the Person class of the Tepuy-COVID ontology aligns with the PREO ontology's Person class. Likewise, patient, found in the two origin ontologies, is enriched in the alignment ontology with two new subclasses, InPatient and OutPatient, which identify whether the patient is hospitalized or not. Another case is the Nurse class, which became a subclass of Provider, where Provider is a subclass of Health care Role; this gives a better classification of the classes related to medical personnel according to their role. Note that in this particular process the linking ontology of the two source ontologies is embedded in the alignment ontology, which contains the links to the sub-trees of the source ontologies that complement each other's information.

For the validation of our ontological models, we adapted the method described in [33] , considering three types of validations: (i) Application of competence questions: to establish the coherence of the ontology information, resulting from the ontological engineering process. (ii) Quality validation through metrics: to determine the alignment precision of concepts, relationships and individuals, that is carried out during the ontological alignment process. (iii) Component validation using Protégé: to verify the ontology consistency resulting from the ontological mixing processes, in terms of their hierarchical and axiomatic structure.

Competency questions are user-oriented questions used to evaluate an ontology [42]. In other words, they are questions that users would want answered by querying the ontology. In this research, the competence questions verify the integrity of the data at the end of the mixing and linking processes. In the case of the Tepuy-COVID ontology, Q1, Q2, and Q3 demonstrate that the information provided by the Tepuy-COVID ontology is the same as the sum of the results of each ontology used in the mixing. Similarly, for the COVID-19 Pandemic ontology, Q4 and Q5 demonstrate that its information is the product of linking concepts present in the Tepuy-COVID ontology and the ontologies of other domains. For that, queries were executed in Protégé to determine the quality of the inferred information [33]. These competence questions are used in the experimentation of section 6.

The quality validation through metrics of the new ontology begins from the phase of alignment of ontological models. The first step evaluates the alignments of the classes, properties, and instances obtained in the alignment phase. For this, we use the Precision, Recall, and F-Measure metrics defined in [7, 21] . In order to use these measures, it is necessary to compare against a manual alignment (expert). For this purpose, two experts made the annotations of the alignments between the ontologies considered in the experimentation and the results were consolidated in a single description, using the Kappa index to solve the inconsistencies between the annotations of the experts according to the procedure indicated in [41] . Additionally, we use a threshold (0.8) to define the correspondences. In this way, we evaluate the alignments that we use both in the linking and merging processes.

Particularly, we define TP, FN, FP and TN to calculate Precision, Recall, and F-Measure, according to the comparison between the automatic alignments with the manual alignments (see an example in Table 5 ), which are explained below:

• True Positives (TP): matches that are recognized by both the manual and the automatic approaches. In the example of Table 5, the concept Hospitalized is the same in both ontologies, so it is true in the manual approach. The automatic approach generates a relation strength of 1.0, which is greater than the threshold of 0.8; therefore, it is also true.
• False Negatives (FN): alignments found in the manual approach but not identified automatically. In the example of Table 5, the automatic approach generates a relation strength of 0.57, which is treated as a false alignment because it is below the threshold of 0.8. In the manual approach, the expert considers that there is a match between the Laboratory Test Finding and Laboratory_question concepts; therefore, it is true.
• False Positives (FP): alignments falsely proposed by our automatic approach. In the example of Table 5, the concepts Secondary Contact and Secondary Type are considered aligned (true) by the automatic approach, since it indicates a relation strength of 0.81, which exceeds the threshold of 0.8. In the manual approach, the expert determines that they are not aligned (false).
• True Negatives (TN): false alignments that the automatic approach has correctly discarded. In this case, the concepts Tired and Piped are not aligned, so the pair is false in the manual approach. The automatic approach generates a relation strength of 0.6, lower than the threshold of 0.8; therefore, it is also false.

Precision is the ratio between TP and the sum of FP and TP; Recall is the ratio between TP and the sum of FN and TP; and F-Measure is the harmonic mean of Precision and Recall.
Table 6 presents the results of the alignment validation obtained from the CODO, COVIDCRFRAPID, and PREO ontologies, using the Precision, Recall, and F-measure metrics provided by the Alignment API tool. In this case, the automatic alignment is the one produced by the Alignment API tool using the Levenshtein measure with a threshold of 0.8. As we can see, the Precision of our automatic approach is 1.00, while the Recall and F-measure values are affected by the number of FN and FP recognized by the automatic approach. Consequently, the automatic alignment presents a recall of 0.8. That is, the similarity measure recognizes 80% of all correct alignments (completeness) [30].

Table 6: Results of the alignment process
Ontology | Precision | Recall | F-measure
COVID-19 Pandemic Ontology | 1.00 | 0.80 | 0.89
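The validation scheme above can be sketched in a few lines: each candidate pair is classified by comparing the expert judgment with the thresholded relation strength, and the metrics follow from the resulting counts. The function names are hypothetical, the four classified pairs reuse the strengths quoted from Table 5, and the counts used for the metrics are illustrative values chosen only to reproduce the Precision and Recall of Table 6; they are not the counts of the study.

```python
def classify(expert_says_match: bool, strength: float, threshold: float = 0.8) -> str:
    """Compare the manual judgment with the thresholded automatic result."""
    automatic_match = strength > threshold
    if expert_says_match and automatic_match:
        return "TP"
    if expert_says_match:
        return "FN"
    if automatic_match:
        return "FP"
    return "TN"


def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn) if (tp + fn) else 0.0


def f_measure(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Pairs and strengths as quoted in the discussion of Table 5.
print(classify(True, 1.0))    # Hospitalized / Hospitalized
print(classify(True, 0.57))   # Laboratory Test Finding / Laboratory_question
print(classify(False, 0.81))  # Secondary Contact / Secondary Type
print(classify(False, 0.6))   # Tired / Piped

# Illustrative counts only (assumption): 4 TP, 1 FN, 0 FP.
p, r = precision(tp=4, fp=0), recall(tp=4, fn=1)
print(p, r, round(f_measure(p, r), 2))
```

With Precision 1.00 and Recall 0.80 the harmonic mean rounds to 0.89, which matches the F-measure reported in Table 6.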

In addition to the metrics mentioned above, we evaluate the new ontology to determine its consistency. In this case, we are using the reasoner Pellet from Protégé. As we see in Figure 8 , the ontology does not show problems of inconsistencies. The classes, properties, and individuals of the three ontologies were correctly integrated into the ontology since there are no errors in the inferences made by the reasoner at the hierarchy level of classes, properties, objects, and individuals. 

This section is divided into three parts. The first part shows the merging of ontologies from the same domain (section 5.1), obtaining the Tepuy-COVID Ontology. The second part shows the linking cases with ontologies from other fields (section 5.2), getting the COVID-19 Pandemic ontology. Finally, the third part presents the validations of the COVID-19 Pandemic ontology in different cases, and analyzes the results obtained from the entire process (section 5.3). The resulting ontologies from our ontological engineering process can be found in http://bit.do/fSFy4.

For this stage, we chose five ontologies representative of the COVID-19 domain, which were all the ontologies about COVID available in the BioPortal repository (https://bioportal.bioontology.org/). Following our ontological engineering process, we obtained the Tepuy-COVID Ontology by merging the selected ontologies with their respective classes and sources (Table 7). It is worth mentioning that these ontologies describe different sub-domains of knowledge; the aim is to create a new ontology that represents COVID-19 and includes the vast majority of the characteristics or concepts related to this disease. The first step of the merging is to determine the possible alignments between the ontologies to be merged. However, since the Alignment API tool only obtains the alignments between two ontologies, and in this case there are more than two ontologies to be merged, it is necessary to align every pair of ontologies to find all the possible alignments. Table 8 shows the alignments according to each similarity technique with a threshold higher than 0.8 for the CODO and COVIDCRFRAPID ontologies. There is an important semantic similarity between the classes and properties of the ontologies: the value of 1 in the four similarity measures indicates an exact match in the names of classes and properties (for example, cases 1 to 3 of Table 8). Furthermore, in cases 4 to 9, the concepts have a lexical similarity (measure 2). Finally, in case 10, the semantic relationship between the concepts is due to the context to which they belong: the WordNet thesaurus relates the two concepts as synonyms. In summary, alignment was achieved between classes (cases 4, 5, 7, 9 and 10), properties (case 6) and instances (cases 1, 2, 3 and 8). Figure 9 shows the process of merging the five ontologies to generate the Tepuy-COVID Ontology. As we can see, interesting cases appear in this process, for example, the addition of the Coronavirus and Coronavirus infection classes (obtained from the alignment process, see Table 5) to the disease caused by Coronavirus class. Also, we can see how the Nasal_prongs entity was aligned with Nasal congestion. In this way, we have obtained the Coronavirus domain model, which represents the union of the five ontologies that address different aspects of the disease.
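Because the alignment tool works on two ontologies at a time, the pairwise step can be planned by enumerating all unordered pairs. The sketch below only builds that plan; the helper name is hypothetical, and the list of five names is an assumption taken from the ontologies mentioned in the related work, which may not be exactly the five listed in Table 7.

```python
from itertools import combinations

# Assumption: illustrative names; Table 7 holds the actual selection.
ontologies = [
    "CODO",
    "COVIDCRFRAPID",
    "CIDO",
    "COVID-19 Surveillance Ontology",
    "DRUGS4COVID19",
]


def pairwise_alignment_plan(names):
    """Return every unordered pair of ontologies that has to be aligned."""
    return list(combinations(names, 2))


plan = pairwise_alignment_plan(ontologies)
for left, right in plan:
    print(f"align {left} <-> {right}")
print(len(plan), "pairwise alignments")
```

For n ontologies this gives n(n-1)/2 runs of the alignment tool, so five ontologies require ten pairwise alignments.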

In this case, the aim is to extend the domain of the Tepuy-COVID Ontology to complementary domains, exploiting the structure (classes, properties, individuals, etc.) of the ontologies to allow the exchange of information between them. Moreover, this domain expansion makes it possible to establish generalizations and specializations for the Tepuy-COVID Ontology. Table 9 presents a list of ontologies from other domains that are of interest for the growth of the Tepuy-COVID Ontology. Next, two case studies are presented in which different ontological engineering processes are applied to show how the Tepuy-COVID Ontology is linked with other domains.

• Objective: enrich the Tepuy-COVID Ontology with information related to people (Doctor, Nurse, Patient, Family members, etc.), incorporating concepts and properties such as context, personality, emotions, affectivity, education, and experiences, among others. The enrichment result is reflected in the COVID-19 Pandemic Ontology.
• Intervening ontologies: Tepuy-COVID Ontology (see section 5.1) and PersonasOnto (see Figure 10).
• Ontological Engineering Process: both ontologies are aligned to determine the relationships between their classes and properties. Table 10 shows the alignments found using the Alignment API tool with different similarity techniques and a threshold of 0.8, achieving alignment between classes (cases 1, 2, 5, 7, 8, 13, 15, 18, 19, 26, and 27) and properties (cases 3, 4, 6, 9-12, 14, 16, 17, 23-25, 28, and 29). Notably, there is a significant number of alignments with an exact match of the names of the concepts and properties (cases 1 to 9). Cases 10 to 12 present a similarity according to three of the four measures used, where the relationships between the concepts occur at the level of character strings (SMOA similarity measure) and linguistic similarity (3 and 4). We highlight that the expert discards a relationship between the concepts for cases 15, 17, 20, 23, 24 In Figure 10 is

• Objective: Enrich the Tepuy-COVID Ontology with information related to the providers, patients' family members, and other aspects. The enrichment result can be seen in the COVID-19 Pandemic Ontology.
• Intervening ontologies: Tepuy-COVID Ontology (see section 5.1) and PREO (see Table 8).
• Ontological Engineering Process: As in the previous case, the first step is to align both ontologies to determine the relationships between their classes and properties (see section 4.1). Table 11 lists the alignments found by the Alignment API using different similarity techniques and a threshold of 0.8. Most of the alignments between concepts are determined by similarity measure 2 (SMOA similarity), that is, by the character strings that make up the concept names. For cases 1 and 2, on the other hand, similarity measures 3 and 4 determine that the concepts are synonymous based on the thesaurus each measure uses. Finally, we generate the linking ontology (see section 4.3). Figure 11 shows the final ontology together with the two ontologies used in the linking process. In general, PREO enriches the Tepuy-COVID Ontology with many classes. Specifically, Figure 11 shows the following: Gender_type of the Tepuy-COVID Ontology is extended with Gender of the PREO ontology, acquiring the entire tree of PREO's Gender subclasses, which includes classes such as Female, Male and OtherGender (GenderNonConforming and Transgender). The Tepuy-COVID Ontology also receives new PREO concepts, such as MaritalStatus, Race, ReligiousAffilation, SexualOrientation, and SocioEconomicStatus, among many others.
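The thresholding step used in both case studies can be illustrated with a minimal sketch. It is not the Alignment API itself: it assumes a plain normalized Levenshtein similarity over lower-cased labels, and the label lists are hypothetical (only Gender_type and Gender are named in the text). The function names `levenshtein`, `similarity`, and `align` are introduced here for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical labels (case-insensitive)."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def align(labels_a, labels_b, threshold=0.8):
    """Return (label_a, label_b, score) for every pair scoring at or above the threshold."""
    return [(x, y, round(similarity(x, y), 2))
            for x in labels_a for y in labels_b
            if similarity(x, y) >= threshold]


# Hypothetical label lists used only to exercise the functions.
tepuy_labels = ["Gender_type", "Person", "Place"]
other_labels = ["Gender", "Person", "Race"]
candidates = align(tepuy_labels, other_labels)
print(candidates)
```

With these labels only the exact match (Person, Person) passes the 0.8 threshold. Gender_type and Gender score about 0.55 under this plain edit-distance measure, which illustrates why a single string measure is not enough and why several similarity techniques plus expert review are combined.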

This section discusses the results of our research and is organized as follows: first, we present the findings made while merging the five domain ontologies of COVID-19 to obtain the Tepuy-COVID Ontology (section 5.1). Next, we describe the results obtained in linking the Tepuy-COVID Ontology with ontologies of other domains (see Table 9) and their performance.

In Figure 12, the result of the Q2 competence question in DL Query shows three individuals with a diagnosis associated with COVID-19. In addition, the right-hand side shows all the characteristics of the individual p000001 with COVID-19, for example, symptoms (Fever and Upper Respiratory Tract Infection) and gender (Male), among others.

Table 13 presents the quality validation, using the Precision, Recall, and F-measure metrics, of the alignments between the five domain ontologies of COVID-19 listed in Table 7. The precision values for the four similarity measures indicate that they perform the alignments correctly, since "a perfect precision score of 1.0 means that every correspondence computed by the algorithm was correct (correctness)" [30]. However, the same does not hold for the recall values of the four similarity measures. The SMOA similarity technique reaches a recall of 0.9, that is, it finds 90% of all correct correspondences (F-Measure of 0.95), which makes its alignments more complete than those identified by the Levenshtein measure. Note that SMOA analyzes commonalities and differences between concepts [26], while Levenshtein (recall 0.40) only bases the calculation on the sets of similar strings between concepts [7]. On the other hand, the similarity measures based on the WordNet thesauri present a recall of 0.30 and 0.40, which shows that only 30% of the ontology concepts were found in the thesauri.
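As a worked illustration of how the reported figures relate to each other, the following sketch computes Precision, Recall, and F-measure from a found alignment set and a reference alignment set. The function name `evaluate` and the correspondence pairs are hypothetical; the sets are sized only so that the numbers reproduce the SMOA values quoted above (precision 1.0, recall 0.9).

```python
def evaluate(found: set, reference: set) -> tuple:
    """Return (precision, recall, f_measure) of `found` against `reference`."""
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical reference alignment with 10 correct correspondences.
reference_alignment = {(f"a#C{i}", f"b#C{i}") for i in range(10)}
# Hypothetical matcher output: 9 correspondences, all of them correct.
found_alignment = {(f"a#C{i}", f"b#C{i}") for i in range(9)}

precision, recall, f_measure = evaluate(found_alignment, reference_alignment)
print(precision, recall, round(f_measure, 2))
```

With precision 1.0 and recall 0.9, the F-measure is 2 x 1.0 x 0.9 / 1.9, approximately 0.95, which agrees with the value reported for SMOA.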

Tepuy-COVID Ontology, caused by the difference between the types of ontological element to be aligned (the first is an individual and the second is a subclass). Thus, this alignment is removed from the model to ensure the ontology's integrity. Table 14 presents the application of competence questions Q4 and Q5 to the COVID-19 Pandemic Ontology. In all linked ontologies, the individuals of the Tepuy-COVID Ontology are maintained (see Table 12). Thus, linking the Tepuy-COVID Ontology with each of the domain ontologies does not affect its integrity, even though it acquires knowledge of these domains.

To demonstrate the enrichment of the Tepuy-COVID Ontology with knowledge of the external PersonasOnto ontology, the following is required:
• Fill the class AffectiveState -> Emotion of PersonasOnto with individuals that represent emotions. For this specific case, the individuals Happy and Sad were added.
• Relate the individuals of the Person class of Tepuy-COVID with the emotional states (hasAffectiveState) present in PersonasOnto. For this specific case, the Happy state was added to the individual p000004 and the Sad state to p000005.

In Figure 13, two aspects are observed.

To demonstrate the enrichment of the Tepuy-COVID Ontology with knowledge of the external PREO ontology, the following is required:
• Fill PREO's Environment Factor: Health care System Factor class with individuals that represent factors in the health system. For this specific case, the individuals Wheelchair and Stretcher were added.
• Relate the individuals of the Person class of Tepuy-COVID with the environmental factors (hasEnvironmentFactor) present in PREO. For this specific case, the Wheelchair state was added to the individual p000003 and the Stretcher state to p000001.

In Figure 14, two aspects are observed. On the right is the hasEnvironmentFactor property of the PREO ontology, which is a subproperty of hasFactorByType. On the left is the result of executing the competence question (Q4), which makes use of the knowledge of both linked ontologies: it uses the 'has diagnosis' property of Tepuy-COVID on the one hand, and the hasFactor property of PREO on the other. The result of the query indicates that person p000001 has been diagnosed with COVID and is on a stretcher.
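The way such a competence question combines knowledge from both ontologies can be sketched with a small in-memory triple set. This is only an illustration under stated assumptions, not the DL Query used in the paper: the triples are limited to facts mentioned in the text, the query is expressed over the super-property hasFactorByType (the only super-property named in the text), and the helper names `super_properties` and `values` are hypothetical.

```python
# Asserted facts, written as (subject, property, object) triples.
triples = {
    ("p000001", "has diagnosis", "COVID-19"),           # Tepuy-COVID side
    ("p000001", "hasEnvironmentFactor", "Stretcher"),   # PREO side
    ("p000003", "hasEnvironmentFactor", "Wheelchair"),  # PREO side
}

# Property hierarchy stated in the text: hasEnvironmentFactor is a
# subproperty of hasFactorByType.
sub_property_of = {"hasEnvironmentFactor": "hasFactorByType"}


def super_properties(prop):
    """Yield a property followed by all of its ancestors in the hierarchy."""
    while prop is not None:
        yield prop
        prop = sub_property_of.get(prop)


def values(subject, prop):
    """Objects linked to `subject` through `prop` or any of its subproperties."""
    return {o for s, p, o in triples
            if s == subject and prop in super_properties(p)}


# Persons diagnosed with COVID-19 together with their factors.
subjects = {s for s, _, _ in triples}
result = sorted((s, factor)
                for s in subjects
                if "COVID-19" in values(s, "has diagnosis")
                for factor in values(s, "hasFactorByType"))
print(result)
```

The factor is asserted with hasEnvironmentFactor but retrieved through its super-property, which mirrors the kind of inference a reasoner contributes once the two ontologies are linked. In this sketch p000003 is not returned because no diagnosis is asserted for that individual here.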

Table 15 shows the quality validation through metrics, starting with the results obtained from the linking processes between the Tepuy-COVID Ontology and the ontologies of other domains (Table 9 of section 5.2). As we can see, a set of alignments that comply with the threshold is obtained out of the total possible links. There are also cases in which none of the similarity measures found possible alignments (for example, cases 1 and 3); in others, such as case 2, the alignments found are not sufficient because they did not exceed the threshold of 0.8.

For the other cases, a large number of alignments were found (some repeated between methods), but few exceeded the defined threshold. For example, alignments between Tepuy-COVID Ontology and PersonasOnto exceeding a 0.

Table 16. Evaluating Alignment of the Linking Process using Alignment API

On the other hand, regarding the consistency validation of the ontological model, case 4 presents inconsistencies, detected with the Pellet reasoner, when linking the Tepuy-COVID Ontology with the DoCO ontology (codo#Business with doco#Line), due to the difference between the two concepts. In case 6, which links the Tepuy-COVID Ontology with PersonasOnto, the Place and Group concepts are different types of ontological element in the two ontologies (see Table 10, cases 1 and 8). Also, the alignment between Businesses and BusinessGoals (see Table 10, case 13) conflicts with the alignment of the concept Organization. In case 7, the alignment of the concept Contains causes inconsistencies, although the concepts are the same in the two ontologies (see Table 11, cases 16 and 17). Finally, in the rest of the cases where the Tepuy-COVID Ontology is linked to other domains (cases 1, 2, 3, 5, 8, and 9), no inconsistencies were found, which means that consistency is maintained in the model at the level of its classes, objects, properties, and individuals.
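One of the inconsistency sources discussed above, aligning elements of different ontological types, can be screened with a simple check before the reasoner is run. The sketch below is only such a pre-check and does not replace a reasoner like Pellet; the entity names, their kinds, and the function `split_by_kind` are hypothetical and do not reproduce the contents of Tables 10 or 11.

```python
from enum import Enum


class Kind(Enum):
    CLASS = "class"
    PROPERTY = "property"
    INDIVIDUAL = "individual"


def split_by_kind(alignments, kinds_a, kinds_b):
    """Separate alignments whose two sides have the same kind from the rest."""
    kept, removed = [], []
    for a, b in alignments:
        (kept if kinds_a[a] == kinds_b[b] else removed).append((a, b))
    return kept, removed


# Hypothetical entities of two ontologies and the kind of each one.
kinds_a = {"a#Person": Kind.CLASS, "a#Place": Kind.INDIVIDUAL}
kinds_b = {"b#Person": Kind.CLASS, "b#Place": Kind.CLASS}

# Hypothetical candidate alignments that passed the similarity threshold.
candidate_alignments = [("a#Person", "b#Person"), ("a#Place", "b#Place")]

kept, removed = split_by_kind(candidate_alignments, kinds_a, kinds_b)
print(kept)
print(removed)
```

Here the second pair is set aside because it would equate an individual with a class, which is the same kind of mismatch that leads to removing an alignment from the model.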

This research presents an ontological engineering process for integrating ontologies related to the COVID-19 disease and other contexts, for the treatment and representation of the large amounts of information generated by the pandemic. First, the disease domain ontologies and the ontologies of other domains are selected; then, alignment and merging processes are performed between the COVID ontologies, and mapping processes are performed with the ontologies from different contexts. Each partial result was evaluated with the Precision, Recall, and F-Measure metrics to determine the quality and consistency of the resulting COVID-19 Pandemic Ontology. The experimental results confirm that the proposed method yields an adequate ontological generation. The correctness and completeness of the alignment sets obtained with the four similarity measures exceed 60%, and these sets are finally used in the construction of the ontological model.

Concerning the Tepuy-COVID Ontology, Precision reaches 100%, meaning that all the alignments recognized between the five ontologies of the COVID-19 domain are correct. Similarly, recall reaches 0.90 with the SMOA similarity measure, which implies that 90% of all alignments between the ontologies have been recognized. Furthermore, the resulting model has been validated using the Pellet reasoner, which shows that the new model is consistent in terms of classes, properties, and integrated instances. In the literature, these ontologies of the COVID-19 domain have been used to represent the information of the pandemic, but they have not been combined or related to ontologies of other fields to represent complex contexts of the pandemic. Our proposal offers the COVID-19 Pandemic Ontology, which combines five ontologies from the COVID domain with nine ontologies from different disciplines, generating a model capable of representing various types of information about the pandemic. Thus, it will allow the large amount of information generated by the pandemic to be used from a comprehensive perspective.

Future research will focus on improving the experiments by using other semantic sources, such as domain thesauri, to increase the recognition of synonymous concepts. In addition, it is necessary to test other similarity techniques based on neural networks and deep learning to increase the number of recognized alignments. Similarly, the consistency of the ontological model should be validated with other reasoners.

Another important aspect is that many of these ontological models do not contain individuals in their structure. Thus, one of the possible challenges is to enrich the ontologies with information from the different contexts involved, with the help of Linked Data and Natural Language Processing techniques for the extraction and population of the ontological model. Some formal schemes that automate this ontology enrichment process have been developed [35] and could be considered. Using approaches based on context ontologies [31, 32] is another alternative. In addition, we will establish new metrics that allow us to validate the resulting model according to its completeness and robustness. With this, we will evaluate the consistency of the ontology and its capacity to represent the contexts linked in the new model.

On behalf of all authors, the corresponding author states that there is no conflict of interest.


Ontology-based interoperation model of collaborative product development

A Multi-dimensional Ontology Model for Product Lifecycle Knowledge Management

Ontology Fusion in high-Level-Architecture-Based Collaborative Engineering Environments

Ontology maintenance in a hierarchical federated collaborative product development environment

Study of the Methods and Tools for Ontology Integration

Automatic Semantic Retrieval and Visualization Model Based on the Integrated Ontology Library

Overview of the neon toolkit

Collaborative editing of ontologies using fluent editor and ontorion

Semi-automatic merging of ontologies using protégé

Evolution of the COMA match system

Towards VocBench 3: pushing collaborative development of thesauri and ontologies further beyond

Binary codes capable of correcting deletions, insertions, and reversals

Análisis de las contradicciones en las competencias profesionales en los textos digitales usando lógica dialéctica

Determination of Professional Competencies Using an Alignment Algorithm of Academic Profiles and Job Advertisements, Based on Competence Thesauri and Similarity Measures

Introducción a la Minería Semántica

Semantic similarity from natural language and ontology analysis

Semantic similarity: a key to ontology alignment

An API for ontology alignment

Aligning business process models

Comparison of schema matching evaluations

CODO: an ontology for collection and analysis of COVID-19 data

CIDO, a community-based ontology for coronavirus disease knowledge and data integration, sharing, and analysis

COVID-19 Ontologies and their Applications

The Infectious Disease Ontology in the Age of COVID-19

A string metric for ontology alignment

An API for multilingual ontology matching

Pellet: A practical owl-dl reasoner

Evaluation of linguistic similarity measurement techniques for ontology alignment

Automatic ontology matching via upper ontologies: A systematic evaluation

CARMiCLOC: Context Awareness Middleware in Cloud Computing

Value of repeated fine-needle aspiration cytology and cytologic experience on the management of thyroid nodules

Using Multilayer Fuzzy Cognitive Maps to diagnose Autism Spectrum Disorder

An Approach for Multiple Combination of Ontologies Based on the Ants Colony Optimization Algorithm

Application of category theory, Ingénierie des Systèmes d'Information

An Approach for the Emerging Ontology Alignment based on the Bees Colonies

Procedure Based on Semantic Similarity for Merging Ontologies by Non-Redundant Knowledge Enrichment

The Infectious Disease Ontology in the Age of COVID-19

CODO: an ontology for collection and analysis of COVID-19 data

CIDO, a community-based ontology for coronavirus disease knowledge and data integration, sharing, and analysis

Method for Emotion Corpus Validation from the Consensual Identification of Patterns in Alzheimer's Patients

FOCA: A methodology for ontology evaluation, Cornell university