Title: Citation Recommendation for Research Papers via Knowledge Graphs
Authors: Arthur Brack, Anett Hoppe, Ralph Ewerth
Date: 2021-06-10

Abstract: Citation recommendation for research papers is a valuable task that can help researchers improve the quality of their work by suggesting relevant related work. Current approaches for this task rely primarily on the text of the papers and the citation network. In this paper, we propose to exploit an additional source of information, namely research knowledge graphs (KGs) that interlink research papers based on mentioned scientific concepts. Our experimental results demonstrate that combining information from research KGs with existing state-of-the-art approaches is beneficial. Experimental results are presented for the STM-KG (STM: Science, Technology, Medicine), which is an automatically populated knowledge graph based on the scientific concepts extracted from papers of ten domains. The proposed approach outperforms the state of the art with a mean average precision of 20.6% (+0.8) for the top-50 retrieved results.

Citations are a core part of research articles, as they enable the reader to position the novel contribution in the scientific context. Moreover, relating one's own contributions to relevant research via references can also improve visibility. It is therefore in the interest of authors to provide complete and high-quality citation links to existing research. However, this task becomes ever more complicated, since the number of published research articles has been growing exponentially in recent years [5].
Consequently, the recommendation of suitable references for a piece of scientific writing is an important task to (a) improve the quality of future publications, (b) help authors and reviewers to point out additional relevant related work, and (c) discover interesting links to other areas of research. Färber and Jatowt [14] distinguish between (1) local citation recommendation, which aims to provide citations for a short passage of text, and (2) global citation recommendation, which uses the document's full text or abstract as the input. Here, we focus on the task of global citation recommendation.

Current best-performing approaches for global citation recommendation [4, 9, 32] leverage primarily the articles' text and the citation network as information sources. In this paper, we explore another source of information, namely the set of scientific concepts mentioned in the article. The assumptions are (1) that, in addition to the article's text, these concepts provide condensed evidence about the described problem statement, the used methodology, or the evaluation metrics, and (2) that research papers that should cite each other usually share a similar set of concepts. Consequently, we investigate whether research KGs interconnecting research papers based on the mentioned scientific concepts are instrumental in improving citation recommendation. For this purpose, we propose an approach that combines automatically extracted scientific concepts from the research articles with existing approaches for citation recommendation. The approach is evaluated on a KG that has been automatically populated from papers of ten scientific domains [6]. Our experimental results demonstrate that our proposed approach consistently improves the state of the art with a MAP@50 (mean average precision of the top-50 results) of 20.6% (+0.8). To facilitate further research, we release all our corpora and source code: https://github.com/arthurbra/citation-recommendation-kg.
The remainder of the paper is organised as follows: Section 2 reviews existing research KGs and approaches for citation recommendation. In Section 3, we describe our proposed approach. The experimental setup and results are reported in Sections 4 and 5, while Section 6 concludes the paper and outlines future work.

Here, we briefly review research KGs and approaches for citation recommendation. Various KGs interlink research papers through metadata (e.g. authors, venues) and citations [13, 22], or through research artefacts (e.g. datasets) [1, 23]. Other initiatives organise scientific knowledge in a structured manner with community effort, such as Gene Ontology [10], WikiData [28] with encyclopaedic knowledge, or Papers With Code [24] and the Open Research Knowledge Graph [16] for research contributions. Furthermore, various KGs have been populated automatically from research articles. The Computer Science Ontology (CSO) is a taxonomy for computer science research areas [27]. Kannan et al. [19] create a multimodal KG for deep learning papers from text, images, and the corresponding source code. The AI-KG has been generated from 333,000 research papers in the field of artificial intelligence (AI) [11]. It contains five concept types (tasks, methods, metrics, materials, others) linked by 27 relation types. The COVID-19 KG [30] has been populated from the COVID-19 Open Research Dataset [29] and contains various biological concept entities. Brack et al. [6] generate a KG for ten science domains with the concept types material, method, process, and data.

In the following, we outline recent approaches for global citation recommendation. For local recommendation, we refer to the survey of Färber and Jatowt [14]. Bhagavatula et al. [4] propose a neural network-based document embedding model to retrieve candidate documents for a query document via similarity search [18] and a ranking model to rerank the top-k candidates.
The document embedding model is trained via a triplet loss on the papers' abstracts and titles using a Siamese architecture. It learns a high cosine similarity between document embeddings of papers citing each other. The reranker estimates the probability that a query document should cite a candidate document using the abstract, title, and optional metadata (e.g. author, venue) as features. Cohan et al. [9] propose a document embedding model named SPECTER (Scientific Paper Embeddings using Citation-informed TransformERs). The SPECTER model is trained with an approach similar to Bhagavatula et al. [4]. However, they use a BERT encoder [12] pre-initialised with SciBERT embeddings [3]. Furthermore, Cohan et al. omit the reranking step and obtain the ranked results directly via the document embeddings' cosine similarity.

Graph-based representation learning approaches learn document embeddings via graph convolutional networks on the citation graph [15, 20, 31]. However, they require the citation network also at inference time. Other approaches [7, 17, 32] frame citation recommendation as a binary classification task: given a query and a candidate paper, the model learns to predict whether the query paper should cite the candidate paper. The models learn rich relationships between the contents of the two documents via various cross-document attention mechanisms. However, in contrast to the document embedding models [4, 9], such binary classification models cannot be used for retrieval but only for reranking the top-k results, since a query paper has to be compared with all other documents [8]. To the best of our knowledge, no approaches for citation recommendation have yet been proposed that exploit knowledge graphs with scientific concepts.
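The triplet objective described above can be illustrated with a minimal sketch: a single loss evaluation on fixed vectors with a hypothetical margin of 0.5 (the actual models train neural encoders end-to-end; the margin value and cosine distance here are illustrative assumptions):

```python
import numpy as np

def cosine_distance(a, b):
    """1 - cosine similarity: papers citing each other should be 'close'."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def triplet_loss(query, positive, negative, margin=0.5):
    """Penalise the model unless the cited (positive) paper's embedding is
    closer to the query than the non-cited (negative) paper's, by a margin."""
    return max(0.0, cosine_distance(query, positive)
                    - cosine_distance(query, negative) + margin)
```

With a perfectly ordered triplet (positive identical to the query, negative orthogonal), the loss is zero; swapping positive and negative yields a positive penalty that gradient descent would reduce.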
To leverage research knowledge graphs for citation recommendation, we propose an approach that combines document embeddings learned from textual content and the citation graph with scientific concepts mentioned in the document. Let KG = (D, E, V) be a KG, with D the set of documents, E the set of concepts, V ⊆ D × E the set of links between papers and concepts, and E_d ⊆ E the set of concepts mentioned in paper d ∈ D. Let one_hot(e_i) ∈ R^|E| be the one-hot vector for concept e_i, in which the i-th component equals 1 and all remaining components are 0. Now, we construct the concept vector c_d ∈ R^|E| for paper d as follows:

c_d = Σ_{e_i ∈ E_d} one_hot(e_i)    (1)

Furthermore, let s_d be a document embedding of paper d obtained via an existing document embedding model (e.g. SPECTER [9]). The vector representation d of paper d is the concatenation of the concept vector c_d and the document embedding s_d:

d = c_d ⊕ s_d    (2)

For a query paper q ∈ D, the task is to retrieve the top-k results such that papers to be cited appear at the top of the list. We use cosine similarity for retrieval and ranking:

sim(q, d) = (q · d) / (‖q‖ ‖d‖)    (3)

In this section, we describe the experimental setup, i.e. the used benchmark dataset, baseline approaches, and the evaluation procedure.

Benchmark Dataset: Existing benchmark datasets for research paper citation recommendation (e.g. [4, 9, 22]) do not provide a research KG that interlinks papers with scientific concepts. Therefore, we use the STM-KG [6] as our benchmark dataset, whose characteristics are depicted in Table 1. It has been populated from 55,485 abstracts in ten different scientific, technical, and medical domains and comes in two variants: (1) an in-domain KG that shares scientific concepts only between papers of the same domain to avoid ambiguity of scientific terms (e.g. neural network in medicine vs. computer science), and (2) a cross-domain KG that shares scientific concepts also between domains.
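The construction above can be sketched in a few lines of Python (a minimal illustration of Equations 1-3; the concept names and toy embedding values below are hypothetical, not taken from the STM-KG):

```python
import numpy as np

def concept_vector(paper_concepts, concept_index):
    """Equation 1: sum of one-hot vectors for the concepts a paper mentions."""
    c = np.zeros(len(concept_index))
    for concept in paper_concepts:
        c[concept_index[concept]] = 1.0
    return c

def combined_representation(c_d, s_d):
    """Equation 2: concatenate concept vector and document embedding."""
    return np.concatenate([c_d, s_d])

def rank_candidates(query_vec, candidate_vecs):
    """Equation 3: rank candidate papers by cosine similarity to the query."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    sims = [(doc_id, cos(query_vec, v)) for doc_id, v in candidate_vecs.items()]
    return sorted(sims, key=lambda x: x[1], reverse=True)

# Hypothetical toy example: three concepts, two candidate papers.
concept_index = {"neural network": 0, "mri": 1, "accuracy": 2}
q = combined_representation(
    concept_vector({"neural network", "accuracy"}, concept_index),
    np.array([0.1, 0.2]))
candidates = {
    "d1": combined_representation(
        concept_vector({"neural network", "accuracy"}, concept_index),
        np.array([0.1, 0.2])),
    "d2": combined_representation(
        concept_vector({"mri"}, concept_index),
        np.array([-0.2, 0.1])),
}
ranking = rank_candidates(q, candidates)  # d1 shares concepts, ranks first
```

Since d1 mentions the same concepts as the query and has an identical toy embedding, it ranks above d2, mirroring the intuition that papers citing each other share concepts.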
The KG contains 15,395 citation links within the KG in total, of which 2,200 citation links connect papers from different domains. For evaluation, analogous to related work [4, 9], we use only papers that cite at least four papers within the KG, which results in 720 query documents and 4,069 citation links. In contrast to Cohan et al. [9], we pursue a realistic approach like Bhagavatula et al. [4], i.e. we retrieve the top-k documents from all documents in the corpus instead of using predefined candidate sets of 30 documents (5 cited and 25 uncited papers) for each query document.

Baseline Approaches: We compare our approach with two simple (1 & 2) and three strong baselines (3, 4 & 5):
1. Random: the candidate papers are ranked in random order.
2. Concept vector: only the concept vector is used for ranking (see Equation 1).
3. GloVe: the document embedding of a paper is the average of the GloVe [25] word embeddings obtained from the abstract of the paper.
4. SciBERT: the document embedding is the average of the contextual word embeddings obtained from the abstract of the paper via SciBERT [3], which is based on BERT [12] and has been pre-trained on scientific text. It has demonstrated superior performance in various downstream tasks on research papers [3].
5. SPECTER: the document embedding is obtained via SPECTER [9] from the title and the abstract. The SPECTER model has been trained on the textual content and the citation graph of research papers and is the current state of the art.

To compute GloVe and SciBERT document embeddings, we use the sentence-transformers library [26]. For SPECTER, we use the implementation of Cohan et al. [9].

Evaluation: To evaluate the quality of the ranking results for the top k citation recommendations, we use Mean Average Precision (MAP@k) [2, 21] as in related work [9]. MAP@k is the mean of the Average Precision at k (AP@k) scores over the query documents.
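The MAP@k computation can be sketched as follows (a minimal Python sketch; the denominator convention min(R, k) for AP@k follows common IR practice and is an assumption, since the paper does not spell the formula out in full):

```python
def average_precision_at_k(ranked_ids, relevant_ids, k):
    """AP@k: average of Precision@i over each rank i <= k holding a relevant
    (i.e. actually cited) document. Denominator min(|relevant|, k) is a
    common convention (an assumption here)."""
    hits, score = 0, 0.0
    for i, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            hits += 1
            score += hits / i  # Precision@i at this relevant position
    denom = min(len(relevant_ids), k)
    return score / denom if denom else 0.0

def map_at_k(rankings, relevants, k):
    """MAP@k: mean of the AP@k scores over all query documents."""
    aps = [average_precision_at_k(r, rel, k)
           for r, rel in zip(rankings, relevants)]
    return sum(aps) / len(aps)
```

For example, a query citing papers a and c with retrieved ranking [a, b, c] gives AP@3 = (1/1 + 2/3) / 2 = 5/6.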
The metric AP@k assumes that a user is interested in finding many relevant documents and is thus an appropriate evaluation metric for citation recommendation:

AP@k = (1 / min(R, k)) Σ_{i=1}^{k} Precision@i · rel(i)

where R is the number of relevant (cited) documents of the query, Precision@i is the fraction of relevant documents among the top i retrieved documents, and rel(i) equals 1 if the document at position i is relevant, and 0 otherwise.

The boxplots in Figure 1 depict the distribution of cosine similarities of concept vectors between citing and non-citing papers. It can be seen that papers citing each other have on average a higher cosine similarity than papers not citing each other. This underlines our hypothesis that papers citing each other share a common set of scientific concepts.

Table 2 shows the results of the evaluated approaches. Using only the concept vectors for ranking outperforms the random baseline significantly. When using only certain concept types (i.e. process, method, material, or data), we can observe that the material and process concept types contribute most to the results. However, using all concept types together yields the best results. Baseline ranking approaches via document embeddings learned from the text (GloVe and SciBERT), or from text and the citation graph (SPECTER), outperform the ranking via concept vectors alone significantly, with SPECTER performing best, as expected. This indicates that concept vectors alone do not contain enough information for the task of citation recommendation. However, our proposed approach combining document embeddings and concept vectors consistently improves all baseline approaches. For SPECTER, the in-domain KG yields slightly better results than the cross-domain KG. However, in our error analysis we found that concept vectors from the cross-domain KG provide more accurate rankings for cross-domain citations. Our results indicate that the exploitation of a research KG as an additional source of information improves the task of citation recommendation.
In this paper, we have investigated whether an automatically populated research KG can enhance the task of citation recommendation. For this purpose, we have combined document embeddings that have been learned from text and the citation graph with concept vectors representing scientific concepts mentioned in a paper. The experimental results demonstrate that the concept vectors provide meaningful features for the task of citation recommendation. In future work, we plan to evaluate our approach on further research KGs and to develop approaches that can learn document embeddings jointly from text, the citation graph, and the research KG.

References
[1] Research graph: Building a distributed graph of scholarly works using research data switchboard
[2] Rank_eval: Blazing fast ranking evaluation metrics in Python
[3] SciBERT: A pretrained language model for scientific text
[4] Content-based citation recommendation
[5] Growth rates of modern science: A bibliometric analysis based on the number of publications and cited references
[6] Coreference resolution in research papers from multiple domains
[7] Cross-document language modeling
[8] Pre-training tasks for embedding-based large-scale retrieval
[9] SPECTER: Document-level representation learning using citation-informed transformers
[10] The Gene Ontology resource: 20 years and still going strong
[11] AI-KG: An automatically generated knowledge graph of artificial intelligence
[12] BERT: Pre-training of deep bidirectional transformers for language understanding
[13] The Microsoft Academic Knowledge Graph: A linked data source with 8 billion triples of scholarly data
[14] Citation recommendation: Approaches and datasets
[15] Inductive representation learning on large graphs
[16] Open Research Knowledge Graph: A system walkthrough
[17] Semantic text matching for long-form documents
[18] Billion-scale similarity search with GPUs
[19] Multimodal knowledge graph for deep learning papers and code
[20] Semi-supervised classification with graph convolutional networks
[21] Mean Average Precision
[22] S2ORC: The Semantic Scholar Open Research Corpus
[23] The OpenAIRE research graph data model
[24] Papers With Code
[25] GloVe: Global vectors for word representation
[26] Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing
[27] The Computer Science Ontology: A comprehensive automatically-generated taxonomy of research areas
[28] Wikidata: A free collaborative knowledgebase
[29] CORD-19: The COVID-19 Open Research Dataset
[30] COVID-19 knowledge graph: Accelerating information retrieval and discovery for scientific literature
[31] Simplifying graph convolutional networks
[32] Multilevel text alignment with cross-document attention