cord-020793-kgje01qy 2020 Laypeople's increasing difficulty in retrieving and digesting valid and relevant information in their preferred language to make health-centred decisions has motivated CLEF eHealth to organize yearly labs since 2012. Substantial community interest in the tasks and their resources has led to CLEF eHealth maturing as a primary venue for all interdisciplinary actors of the ecosystem for producing, processing, and consuming electronic health information. Information access conferences have organized evaluation labs on related Electronic Health (eHealth) Information Extraction (IE), Information Management (IM), and Information Retrieval (IR) tasks for almost 20 years. The Consumer Health Search (CHS) task follows the standard IR shared-challenge paradigm in that it provides participants with a test collection consisting of a set of documents and a set of topics against which to develop IR techniques. The IR task at the CLEF eHealth evaluation lab 2016: user-centred health information retrieval.

cord-020794-d3oru1w5 2020 title: A Multi-task Approach to Open Domain Suggestion Mining Using Language Model for Text Over-Sampling In this work, we introduce a novel over-sampling technique to address the problem of class imbalance, and propose a multi-task deep learning approach for mining suggestions from multiple domains. Experimental results on a publicly available dataset show that our over-sampling technique, coupled with the multi-task framework, outperforms state-of-the-art open-domain suggestion mining models in terms of the F-1 measure and AUC. In our study, we generate synthetic positive reviews until the numbers of suggestion and non-suggestion class samples become equal in the training set. All comparisons are made in terms of the F-1 score of the suggestion class, for a fair comparison with prior work on representational learning for open-domain suggestion mining [5] (refer Baseline in Table 3). In this work, we proposed a multi-task learning framework for open-domain suggestion mining along with a novel language-model-based over-sampling technique for text (LMOTE).
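The balancing loop just described is simple to picture. A minimal sketch in Python, where `generate_synthetic` is a hypothetical stand-in for the language-model sampler (the actual LMOTE generator is not shown in the source):

```python
import random

def balance_with_synthetic(samples, labels, generate_synthetic, minority=1):
    """Oversample the minority (suggestion) class with synthetic texts until
    the two classes are the same size. generate_synthetic is a placeholder
    for the language-model-based sampler; it is not the authors' code."""
    minority_texts = [s for s, y in zip(samples, labels) if y == minority]
    majority_count = sum(1 for y in labels if y != minority)
    out_samples, out_labels = list(samples), list(labels)
    while sum(1 for y in out_labels if y == minority) < majority_count:
        seed = random.choice(minority_texts)  # condition on a real minority review
        out_samples.append(generate_synthetic(seed))
        out_labels.append(minority)
    return out_samples, out_labels
```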
cord-020801-3sbicp3v 2020 In this paper, we tackle the lack of data by leveraging pre-trained multilingual language models to transfer a retrieval system trained on English collections to non-English queries and documents. [28] leveraged a dataset of Wikipedia pages in 25 languages to train a learning-to-rank algorithm for Japanese-English and Swahili-English cross-language retrieval. In particular, to circumvent the lack of training data, we leverage transfer-learning techniques to train Arabic, Mandarin, and Spanish retrieval models using English training data. We evaluate our models in a zero-shot setting; that is, we use them to predict relevance scores for query-document pairs in languages never seen during training. Because large-scale relevance judgments are largely absent in languages other than English, we propose a new setting for evaluating learning-to-rank approaches: zero-shot cross-lingual ranking.
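To make the zero-shot setting concrete, here is a sketch of scoring with a multilingual cross-encoder fine-tuned only on English relevance data. The checkpoint name is a placeholder and the single-logit head is an assumption; the paper's actual architecture may differ:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder name: any multilingual cross-encoder fine-tuned on English
# relevance judgments would slot in here.
MODEL_NAME = "my-org/multilingual-relevance-cross-encoder"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)

def zero_shot_score(query: str, document: str) -> float:
    """Relevance score for a query-document pair in a language the ranker
    never saw during fine-tuning; the multilingual encoder enables transfer."""
    inputs = tokenizer(query, document, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return logits.squeeze().item()  # assumes a single relevance logit
```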
cord-020806-lof49r72 2020 title: Novel and Diverse Recommendations by Leveraging Linear Models with User and Item Embeddings In this paper, we present EER, a linear model for the top-N recommendation task which takes advantage of user and item embeddings to improve novelty and diversity without harming accuracy. We propose a method to augment an existing linear recommendation model so that it makes more diverse and novel recommendations while maintaining similar accuracy. Experiments conducted on three datasets show that our proposal outperforms the original model in both novelty and diversity while maintaining similar levels of accuracy. On the other hand, as the results in Table 3 show, ELP is able to provide good figures for novelty and diversity, thanks to the embedding model capturing non-linear relations between users and items. It is common in the field of recommender systems for methods with lower accuracy to have higher values of diversity and novelty. FISM: factored item similarity models for top-N recommender systems.

cord-020808-wpso3jug 2020 In order to guide researchers through the process of managing their data, many funding agencies (e.g., the National Science Foundation (NSF), the European Commission (EC), or the Fundação para a Ciência e Tecnologia (FCT)) have created and published their own open-access policies, as well as requiring that any grant proposal be accompanied by a Data Management Plan (DMP). The DMP is a document describing the techniques, methods, and policies on how data from a research project are to be created or collected, documented, accessed, preserved, and disseminated. The second part comprises the execution of the following four tasks and results in both the collection of the necessary mappings between the ontology and the identified DMP templates, and the creation of DL queries based on the funders' requirements. The DMP Common Standard Ontology (DCSO) was created with the objective of providing an implementation of the DMP Common Standards model expressed through semantic technology, which has been considered a possible solution in the data management and preservation domains [9].

cord-020811-pacy48qx 2020 title: Incremental Approach for Automatic Generation of Domain-Specific Sentiment Lexicon To this end, we propose an approach to automatically generate a domain-specific sentiment lexicon using a vector model enriched by weights. Although research has been carried out on corpus-based approaches for the automatic generation of a domain-specific lexicon [1, 4, 5, 7, 9, 10, 14], existing approaches have focused on creating a lexicon from a single corpus [4]. To this end, this work proposes an incremental approach for the automatic generation of a domain-specific sentiment lexicon. We aim to investigate an incremental technique for automatically generating a domain-specific sentiment lexicon from a corpus. Can we automatically generate a sentiment lexicon from a corpus and improve on existing approaches? After detecting a domain shift, we merge the distributions using an approach similar to the one discussed above (updating using the same corpus) and generate the lexicon.
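One common corpus-based recipe for bootstrapping such a lexicon (a hedged illustration, not necessarily the authors' weighting scheme) scores each vocabulary word by its embedding similarity to small positive and negative seed sets; the seed lists below are illustrative:

```python
import numpy as np

POS_SEEDS = ["good", "great", "excellent"]
NEG_SEEDS = ["bad", "poor", "terrible"]

def build_lexicon(vectors):
    """vectors: word -> embedding (np.ndarray). Each word is scored by its
    cosine similarity to the positive seed centroid minus its similarity to
    the negative one; the sign gives the polarity, the magnitude a weight."""
    def centroid(words):
        return np.stack([vectors[w] for w in words if w in vectors]).mean(axis=0)

    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    pos_c, neg_c = centroid(POS_SEEDS), centroid(NEG_SEEDS)
    return {w: cos(v, pos_c) - cos(v, neg_c) for w, v in vectors.items()}
```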
cord-020813-0wc23ixy 2020 Despite the importance of the task, the community still faces a significant lack of large-scale non-factoid question answering collections with real questions and comprehensive relevance judgments. Despite the widely known importance of studying answer passage retrieval for non-factoid questions [1, 2, 8, 18], research progress on this task is limited by the availability of high-quality public data. Although WikiPassageQA is an invaluable contribution to the community, it does not cover all aspects of the non-factoid question answering task and has the following limitations: (i) it contains an average of only 1.7 relevant passages per question and does not cover many questions with multiple correct answers; (ii) it was created from the Wikipedia website, containing only formal text; (iii) more importantly, the questions in the WikiPassageQA dataset were generated by crowdworkers, which differ from the questions that users ask in real-world systems; (iv) the relevant passages in WikiPassageQA contain the answer to the question in addition to some surrounding text. In contrast, ANTIQUE provides a reliable collection with complete relevance annotations for evaluating non-factoid QA models.

cord-020814-1ty7wzlv 2020 In this work, we focus on the problem of entity alignment in Knowledge Graphs (KGs), and we report on our experiences when applying a Graph Convolutional Network (GCN) based model to this task. Graph Convolutional Networks (GCNs) [7, 9], which have recently become increasingly popular, are at the core of state-of-the-art methods for entity alignment in KGs [3, 6, 22, 24, 27]. We investigate the reproducibility of the published results of a recent GCN-based method for entity alignment and uncover differences between the method's description in the paper and the authors' implementation. Overview of the datasets used, with their sizes in number of triples (edges), entities (nodes), relations (distinct edge types), and alignments. GCN-Align [22] is a GCN-based approach that embeds all entities from both graphs into a common embedding space. Semi-supervised entity alignment via knowledge graph embedding with awareness of degree difference; Entity alignment between knowledge graphs using attribute embeddings.

cord-020815-j9eboa94 2020 Experiments on three newswire collections show that there are no significant effectiveness differences between the variants, including Lucene's often-maligned approximation of document length. Although learning-to-rank approaches and neural ranking models are widely used today, they are typically deployed as part of a multi-stage reranking architecture over candidate documents supplied by a simple term-matching method using traditional inverted indexes [1]. Our goal is a large-scale reproducibility study to explore the nuances of different variants of BM25 and their impact on retrieval effectiveness. Their findings are confirmed: effectiveness differences in IR experiments are unlikely to be the result of which BM25 variant a system implements. We implemented a variant that uses exact document lengths but is otherwise identical to the Lucene default. Storing exact document lengths would allow different ranking functions to be swapped at query time more easily, as no information would be discarded at index time.
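For reference, the scoring function under study, in the exact-document-length form the record mentions. This is a textbook sketch (k1 and b set to Lucene's defaults), not the paper's code:

```python
import math
from collections import Counter

def bm25(query_terms, doc_terms, doc_freq, num_docs, avg_len, k1=1.2, b=0.75):
    """BM25 using the exact document length; Lucene's default instead
    quantizes the length normalization, the approximation the study probes."""
    tf = Counter(doc_terms)
    score = 0.0
    for t in query_terms:
        if t not in tf or t not in doc_freq:
            continue
        idf = math.log(1 + (num_docs - doc_freq[t] + 0.5) / (doc_freq[t] + 0.5))
        norm = k1 * (1 - b + b * len(doc_terms) / avg_len)
        score += idf * tf[t] * (k1 + 1) / (tf[t] + norm)
    return score
```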
cord-020820-cbikq0v0 2020 To this end, in this paper we elaborate on the idea of leveraging the available antonyms of the original query terms (if they exist) to eventually produce an answer that provides a better overview of the related information and conceptual space. In their comments on these queries, users mention that the selected (i.e., dual) list "provides a more general picture" and "more relevant and interesting results, although contradicting". For the future, we plan to define appropriate antonym-selection algorithms and relevance metrics, implement the proposed functionality in a meta-search setting, and conduct a large-scale evaluation with real users on exploratory tasks, to identify for which queries, and for what types of users, the dual approach is beneficial.

cord-020830-97xmu329 2020 We show that these monolingual models, trained separately on different languages using multilingual word representations or text-based features, can open the door to irony detection in languages that lack annotated data for irony. We aim here to bridge the gap by tackling ID in tweets from both multilingual (French, English, and Arabic) and multicultural perspectives (Indo-European languages whose speakers share much the same cultural background vs. Arabic). We can justify that by the fact that the language of the Arabic and French tweets is quite informal and has many dialect words that may not exist in the pretrained embeddings we used, compared to the English ones (a lower embedding coverage ratio), which makes it harder for the CNN to learn a clear semantic pattern. The CNN architecture trained on cross-lingual word representations shows that irony has a certain similarity across the languages we targeted despite the cultural differences, which confirms that irony is a universal phenomenon, as already shown in previous linguistic studies [9, 24, 35].

cord-020832-iavwkdpr 2020 title: ChEMU: Named Entity Recognition and Event Extraction of Chemical Reactions from Patents ChEMU involves two key information extraction tasks over chemical reactions from patents. In this paper, we propose a new evaluation lab (called ChEMU) focusing on information extraction over chemical reactions from patents. Our goals are: (1) to develop tasks that impact chemical research in both academia and industry, (2) to provide the community with a new dataset of chemical entities, enriched with relational links between chemical event triggers and arguments, and (3) to advance the state of the art in information extraction over chemical patents. The ChEMU lab at CLEF-2020 offers the two information extraction tasks of named entity recognition (Task 1) and event extraction (Task 2) over chemical reactions from patent documents. ChEMU will focus on these two new tasks of named entity recognition and event extraction over chemical reactions from patents.

cord-020834-ch0fg9rp 2020 We share the story of how an innovation that originated in academia, block-max indexes and the corresponding block-max Wand query evaluation algorithm of Ding and Suel [6], made its way into the open-source Lucene search library. We see this paper as having two main contributions beyond providing a narrative of events: first, we report results of experiments that attempt to match the original conditions of Ding and Suel [6] and present additional results on a number of standard academic IR test collections. Support for block-max indexes was the final feature that was implemented, based on the developers' reading of the paper by Ding and Suel [6], which required invasive changes to Lucene's index format. The story of block-max Wand in Lucene provides a case study of how an innovation that originated in academia made its way into the world's most widely used search library and achieved significant impact in the "real world" through hundreds of production deployments worldwide (if we consider the broader Lucene ecosystem, which includes systems such as Elasticsearch and Solr).

cord-020835-n9v5ln2i 2020

cord-020841-40f2p3t4 2020 In this paper we look beyond metrics-based evaluation of Information Retrieval systems to explore the reasons behind ranking results. We present the content-focused Neural-IR-Explorer, which empowers users to browse through retrieval results and inspect the inner workings and fine-grained results of neural re-ranking models. The explorer complements metrics-based evaluation by focusing on the content of queries and documents and on how the neural models relate them to each other. Users can explore each query result in more detail: we show the internal partial scores and the content of the returned documents, with different highlighting modes to surface the inner workings of a neural re-ranking model. The explorer displays data created by a batched evaluation run of a neural re-ranking model. Additionally, the Neural-IR-Explorer illuminates the pool bias [12] of the MSMARCO ranking collection: the small number of judged documents per query makes the evaluation fragile. We presented the content-focused Neural-IR-Explorer to complement metrics-based evaluation of retrieval models.

cord-020843-cq4lbd0l 2020 This paper presents a pipeline for document and passage retrieval for biomedical question answering built around a new variant of the DeepRank network model, in which the recursive layer is replaced by a self-attention layer combined with a weighting mechanism. On the other hand, models such as the Deep Relevance Matching Model (DRMM) [3] or DeepRank [10] follow an interaction-based approach, in which matching signals between query and document are captured and used by the neural network to produce a ranking score. The main contribution of this work is a new variant of the DeepRank neural network architecture in which the recursive layer originally included in the final aggregation step is replaced by a self-attention layer followed by a weighting mechanism similar to the term gating layer of DRMM. The proposed model was evaluated on the BioASQ dataset, as part of a document and passage (snippet) retrieval pipeline for biomedical question answering, achieving retrieval performance similar to that of more complex network architectures.
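The weighting mechanism borrowed from DRMM is easy to show in isolation. A toy sketch of idf-based term gating (the self-attention layer that produces the per-term scores is omitted, and DRMM's learned scalar weight on the idf is dropped for brevity):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def term_gated_score(per_term_scores, per_term_idf):
    """Combine per-query-term matching scores into one document score,
    weighting each term by a softmax over its idf (DRMM-style gating)."""
    gates = softmax(np.asarray(per_term_idf, dtype=float))
    return float(gates @ np.asarray(per_term_scores, dtype=float))
```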
cord-020846-mfh1ope6 2020

cord-020848-nypu4w9s 2020 Currently, many document analysis systems are trained in part on scene images due to the lack of large datasets of educational image data. In this paper, we address this issue and present SlideImages, a dataset for the task of classifying educational illustrations. SlideImages contains training data collected from various sources, e.g., Wikimedia Commons and the AI2D dataset, and test data collected from educational slides. Born-digital and educational images need further benchmarks on challenging information retrieval tasks in order to test generalization. While document scans and born-digital educational illustrations have materially different appearances, these papers show that the utility of deep neural networks is not limited to scene-image tasks (Fig. 1). The related DocFigure dataset covers similar images and has much more data than SlideImages. In this paper, we have presented the task of classifying educational illustrations and images in slides, and introduced a novel dataset, SlideImages.

cord-020851-hf5c0i9z 2020

cord-020871-1v6dcmt3 2020

cord-020872-frr8xba6 2020

cord-020875-vd4rtxmz 2020 To address this problem, I aim to exploit different techniques such as training neural models, enriching the tweet representation, and studying methods to mitigate the lack of labeled data. In my work, I am interested in tackling the Location Mention Prediction (LMP) problem during time-critical situations. The location taggers have to address many challenges, including microblogging-specific challenges (e.g., tweet sparsity, noisiness, rapidly changing streams, hashtag riding) and task-specific challenges (e.g., the time-criticality of the solution and the scarcity of labeled data). Alternatively, Sultanik and Fink [25] used an Information Retrieval (IR) based approach to identify the location mentions in tweets. Moreover, Hoang and Mothe [8] combined syntactic and semantic features to train traditional ML-based models, whereas Kumar and Singh [13] trained a Convolutional Neural Network (CNN) model that learns a continuous representation of tweet text and then identifies the location mentions.

cord-020880-m7d4e0eh 2020 Task 3 asks participants to retrieve text snippets from a given set of Web pages that would be useful for verifying a target tweet's claim. Finally, the lab offers a fifth task that asks participants to predict the check-worthiness of the claims made in English political debates and speeches. Task 3 is defined as follows: given a check-worthy claim on a specific topic and a set of text snippets extracted from potentially relevant webpages, return a ranked list of all evidence snippets for the claim. Once we acquire annotations for Task 1, we share with participants the Web pages and the text snippets from them solely for the check-worthy claims, which enables the start of the evaluation cycle for Task 3. Task 4 is defined as follows: given a check-worthy claim on a specific topic and a set of potentially relevant Web pages, predict the veracity of the claim.

cord-020885-f667icyt 2020 In this work, we present an approach that uses semantically meaningful, bimodal random walks on real-world heterogeneous networks to extract correlations between nodes and bring together nodes with shared or similar attributes. We propose a novel method that incorporates restaurants and their attributes into a multimodal graph and extracts multiple bimodal low-dimensional representations for restaurants based on the available paths through shared visual, textual, geographical, and categorical features. In this section, we discuss prior work that leverages graph-based structures for extracting information from multiple modalities, focusing on the auto-captioning task that introduced such methods. For each of these sub-networks, we perform random walks and use a variant of the heterogeneous skip-gram objective introduced in [6] to generate low-dimensional bimodal embeddings. Our attention-based model combines the separately learned bimodal embeddings in a late-fusion setup to predict the review volume of the restaurants.
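A minimal sketch of the walk-then-skip-gram recipe behind such bimodal node embeddings; the toy graph, and the use of gensim's plain skip-gram in place of the heterogeneous variant of [6], are simplifications:

```python
import random
from gensim.models import Word2Vec

# Toy bimodal sub-network: restaurant nodes linked to visual-feature nodes.
graph = {
    "r1": ["img_a", "img_b"], "r2": ["img_a"], "r3": ["img_b"],
    "img_a": ["r1", "r2"], "img_b": ["r1", "r3"],
}

def random_walks(graph, num_walks=10, walk_len=20):
    """Generate fixed-length random walks starting from every node."""
    walks = []
    for _ in range(num_walks):
        for start in graph:
            walk = [start]
            while len(walk) < walk_len:
                walk.append(random.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks

# Skip-gram (sg=1) over the walks yields low-dimensional bimodal embeddings.
model = Word2Vec(sentences=random_walks(graph), vector_size=32,
                 window=5, sg=1, min_count=1)
print(model.wv.most_similar("r1"))
```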
cord-020888-ov2lzus4 2020 While most current approaches to cross-modal retrieval revolve around learning how to represent text and images in a shared latent space, we take a different direction: we propose to generalize the cross-modal relevance feedback mechanism, a simple yet effective unsupervised method that relies on standard information retrieval heuristics and the choice of a few hyper-parameters. The model can be understood very simply: similarly to PRF methods in standard information retrieval, the goal is to boost images that are visually similar to the top-ranked images (from a text point of view), i.e., images that are likely to be relevant to the query but were initially badly ranked (which is likely to happen in the web scenario, where text is crawled from the source page and can be very noisy).
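A compact sketch of that feedback loop (an illustration of the general PRF idea, not the paper's exact formulation; it assumes text scores are already normalized to a comparable range):

```python
import numpy as np

def prf_rerank(text_scores, image_feats, k=10, alpha=0.5):
    """Boost images visually similar to the top-k text-ranked images:
    final = (1 - alpha) * text score + alpha * visual similarity to the
    centroid of the top-k. k and alpha are the few hyper-parameters."""
    feats = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    top_k = np.argsort(-text_scores)[:k]      # pseudo-relevant images
    centroid = feats[top_k].mean(axis=0)
    visual_sim = feats @ centroid
    return (1 - alpha) * text_scores + alpha * visual_sim
```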
cord-020890-aw465igx 2020

cord-020891-lt3m8h41 2020

cord-020896-yrocw53j 2020 We present MEMIS, a system that can be used in emergencies like disasters to identify and analyze the damage indicated by user-generated multimodal social media posts, thereby helping disaster management groups make informed decisions. To this end, we propose MEMIS, a multimodal system capable of extracting information from social media that employs both images and text to identify damage and its severity in real time. Therefore, we effectively have three models for each modality: the first for filtering the informative tweets, the next for those pertaining to infrastructural damage (or any other category relevant to the relief group), and the last for assessing the severity of the damage present. Similarly, if at least one of the text and image modalities predicts an informative tweet as containing infrastructural damage, the tweet undergoes severity analysis. Here, we use attention fusion to combine the feature interpretations from the text and image modalities for the severity analysis module [12, 26].

cord-020899-d6r4fr9r 2020 In this paper, we present a conditional GAN with two generators and a common discriminator for multiview learning problems where observations have two views, but one of them may be missing for some of the training samples. We address the problem of multiview learning with Generative Adversarial Networks (GANs) in the case where some observations may have missing views without there being an external resource to complete them. We demonstrate that the generated views allow us to achieve state-of-the-art results on a subset of the Reuters RCV1/RCV2 collections compared to multiview approaches that rely on Machine Translation (MT) to translate documents into languages in which their versions do not exist before training the models. We achieve state-of-the-art performance, compared to multiview approaches that rely on external view-generating functions, on multilingual document classification, a challenging application beyond image analysis, which is the domain of choice for the design of new GAN models.

cord-020901-aew8xr6n 2020

cord-020903-qt0ly5d0 2020 Task-based sessions represent significantly different background contexts that can be used to better understand users' query reformulations. Using insights from large-scale search logs, our findings clearly show that the task is an additional relevant search unit that helps in better understanding users' query reformulation patterns and in predicting the next user query. To design support processes for task-based search systems, we argue that we need to: (1) fully understand how users' tasks, performed in natural settings, drive query reformulation changes; and (2) gauge the similarity of these change trends to those observed in time-based sessions. With this in mind, we perform large-scale log analyses of users naturally engaged in tasks to examine query reformulations from both the time-based and the task-based session perspectives. To identify query reformulation patterns, most previous works used large-scale log analyses segmented into time-based sessions.

cord-020904-x3o3a45b 2020 The primary goal of the proposed model is to combine exact and semantic matching between query and document terms, which has been shown to produce effective performance in information retrieval. In basic retrieval models such as BM25 [30] and the language modeling framework [29], the relevance score of a document is estimated based on explicit matching of query and document terms. Finally, our proposed model for relevance ranking provides the basis for a natural integration of semantic term matching and local document context analysis into any retrieval model. [13] proposed a generalized estimate of document language models using a noisy channel, which captures semantic term similarities computed using word embeddings. Note that in this experiment, we only consider methods that select expansion terms based on word embeddings and not on other information sources, such as the top retrieved documents for each query (PRF).

cord-020905-gw8i6tkn 2020 To model such customer expectations and capture important information from a review text, we propose a novel neural network which leverages review sentiment and product information. In order to address the above issues, we propose a novel neural network architecture that introduces sentiment and product information when identifying helpful content in a review text. In the cold-start scenario, our proposed model demonstrates an AUC improvement of 5.4% and 1.5% on the Amazon and Yelp datasets, respectively, when compared to the state-of-the-art model. From Table 5, we see that adding a sentiment attention layer (HSA) to the base model (HBiLSTM) results in an average AUC improvement of 2.0% and 2.6% on the Amazon and Yelp datasets, respectively. In this paper, we describe our analysis of review helpfulness prediction and propose a novel neural network model with attention modules to incorporate sentiment and product information.

cord-020908-oe77eupc 2020

cord-020909-n36p5n2k 2020 This work contributes:
- the bias goggles model for computing the bias characteristics of web domains for a user-defined concept, based on the notions of Biased Concepts (BCs), Aspects of Bias (ABs), and the metrics of the support of a domain for a specific AB and BC, and its bias score for this BC;
- the introduction of the Support Flow Graph (SFG), along with graph-based algorithms for computing the AB support score of domains, which include adaptations of the Independence Cascade (IC) and Linear Threshold (LT) propagation models, and the new Biased-PageRank (Biased-PR) variation that models different behaviours of a biased surfer (sketched below);
- an initial discussion of performance and implementation issues;
- some promising evaluation results that showcase the effectiveness and efficiency of the approach on a relatively small dataset of crawled pages, using the new AGBR and AGS metrics;
- a publicly accessible prototype of bias goggles.
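A minimal sketch of the biased-teleport idea behind Biased-PR (the actual algorithm models several distinct surfer behaviours; this shows only the simplest one, where the random jump favours domains that support the concept):

```python
import numpy as np

def biased_pagerank(links, support, d=0.85, iters=50):
    """PageRank in which the teleport step lands on domains in proportion
    to their support for the biased concept instead of uniformly.
    links: row-stochastic adjacency matrix; support: nonnegative vector."""
    teleport = support / support.sum()
    pr = np.full(len(support), 1.0 / len(support))
    for _ in range(iters):
        pr = (1 - d) * teleport + d * (links.T @ pr)
    return pr

# Toy example: three domains, the third strongly supports the concept.
links = np.array([[0.0, 0.5, 0.5], [1.0, 0.0, 0.0], [0.5, 0.5, 0.0]])
print(biased_pagerank(links, support=np.array([0.1, 0.1, 0.8])))
```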
cord-020912-tbq7okmj 2020 We evaluate the model on the application of stepwise illustration of recipes, where a sequence of relevant images is retrieved to best match the steps described in the text. More concretely, we incorporate the global context information encoded in the entire text sequence (through the attention mechanism) into a variational autoencoder (VAE) at each time step, which converts the input text into an image representation in the image embedding space. To capture the semantics of the images retrieved so far (in a story/recipe), we assume that the prior of the distribution of the topic given the text input follows the distribution conditioned on the latent topic from the previous time step. We propose a new variational recurrent seq2seq (VRSS) retrieval model for seq2seq retrieval, which employs temporally dependent latent variables to capture the sequential semantic structure of text-image sequences. Our work is related to: cross-modal retrieval, story picturing, variational recurrent neural networks, and cooking recipe datasets.

cord-020914-7p37m92a 2020 From an information retrieval perspective, an interesting task within this setting is finding the best supporting (pro) and attacking (con) premises for a given query claim from a large corpus of arguments [31]. Given a user's keyword query, the system retrieves, ranks, and presents premises supporting and attacking the query, taking into account the similarity of the query to the premise, its corresponding claim, and other contextual information. We assume that we work with a large corpus of argumentative text, for example collections of political speeches or forum discussions, that has already been mined and transformed into claims with their corresponding premises and stances. We consider the following problem: given a controversial claim or topic, for example "We should abandon fossil fuels", a user searches for the most important premises from the corpus supporting or attacking it.

cord-020916-ds0cf78u 2020 The main contributions of this study can be summarized as follows: (a) we introduce the Seed-guided Deep Document Clustering (SD2C) framework, the first attempt, to the best of our knowledge, to constrain clustering with seed words based on a deep clustering approach; and (b) we validate this framework through experiments based on automatically selected seed words on five publicly available text datasets with various sizes and characteristics. The constrained clustering problem we are addressing in fact bears strong similarity to that of seed-guided dataless text classification, which consists in categorizing documents based on a small set of seed words describing the classes/clusters. This can be done by enforcing that seed words have more influence either on the learned document embeddings, a solution we refer to as SD2C-Doc, or on the cluster representatives, a solution we refer to as SD2C-Rep. Note that the second solution can only be used when the clustering process is based on cluster representatives (i.e., R = {r_1, ..., r_K}, with K the number of clusters), which is indeed the case for most current deep clustering methods [1].
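To see how seed words can steer cluster representatives, here is a shallow analogue of the SD2C-Rep idea: ordinary k-means over document embeddings whose centroids are initialized from seed-word vectors. The actual framework learns the embeddings and the constraint jointly; this sketch does not:

```python
import numpy as np

def seed_guided_kmeans(doc_vecs, seed_vecs_per_cluster, iters=20):
    """k-means whose centroids start at the mean of each cluster's
    seed-word vectors, so the seed words shape the representatives."""
    centroids = np.stack([np.mean(s, axis=0) for s in seed_vecs_per_cluster])
    for _ in range(iters):
        # assign every document to its nearest centroid
        dists = np.linalg.norm(doc_vecs[:, None] - centroids[None], axis=-1)
        assignments = dists.argmin(axis=1)
        for k in range(len(centroids)):
            members = doc_vecs[assignments == k]
            if len(members):
                centroids[k] = members.mean(axis=0)
    return assignments, centroids
```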
cord-020918-056bvngu 2020

cord-020927-89c7rijg 2020

cord-020931-fymgnv1g 2020

cord-020932-o5scqiyk 2020 In text similarity search, query processing can be accelerated through dynamic pruning [18], which typically estimates score upper bounds to prune documents unlikely to be in the top K results. As a result, the posting-list entry also stores the root node ID for indexed paths, in order to reconstruct matched substructures at merge time. Define the partial upper-bound matrix W = {w_{i,j}} of dimensions |T_q| x |T|, where T = {T(m), m in T_q} is the set of all token paths from the query OPT (T is essentially the same as the tokenized P(T_q)), and a binary vector x of dimensions |T| x 1 indicating which of the corresponding posting lists are placed in the non-requirement set. We have presented rank-safe dynamic pruning strategies that produce an upper-bound estimation of structural similarity in order to speed up formula search using subtree matching. Our dynamic pruning strategies and specialized inverted index differ from traditional linear text-search pruning methods, and they further associate the query structure representation with the posting lists.

cord-020936-k1upc1xu 2020
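The requirement/non-requirement split used above follows the classic MaxScore intuition; here is a scalar sketch (the paper formulates the selection over token-path posting lists via the matrix W, which this simplification ignores):

```python
def split_non_requirement(upper_bounds, threshold):
    """MaxScore-style split: greedily move the posting lists with the
    smallest score upper bounds into the non-requirement set while their
    combined bound stays below the current top-K entry threshold. A
    document matching only non-requirement lists can never enter the
    top K, so those lists need not drive the traversal."""
    order = sorted(range(len(upper_bounds)), key=lambda i: upper_bounds[i])
    non_req, acc = [], 0.0
    for i in order:
        if acc + upper_bounds[i] < threshold:
            acc += upper_bounds[i]
            non_req.append(i)
        else:
            break
    return non_req
```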