key: cord-0594663-2fzprnuu authors: Min, Bonan; Rozonoyer, Benjamin; Qiu, Haoling; Zamanian, Alexander; MacBride, Jessica title: ExcavatorCovid: Extracting Events and Relations from Text Corpora for Temporal and Causal Analysis for COVID-19 date: 2021-05-05 journal: nan DOI: nan sha: 668e3d7b8f049c67ef8dfc85b3e3022bc03930cd doc_id: 594663 cord_uid: 2fzprnuu

Timely responses from policy makers to mitigate the impact of the COVID-19 pandemic rely on a comprehensive grasp of events, their causes, and their impacts. These events are reported at such a speed and scale as to be overwhelming. In this paper, we present ExcavatorCovid, a machine reading system that ingests open-source text documents (e.g., news and scientific publications), extracts COVID-19 related events and relations between them, and builds a Temporal and Causal Analysis Graph (TCAG). Excavator will help government agencies alleviate the information overload, understand likely downstream effects of political and economic decisions and events related to the pandemic, and respond in a timely manner to mitigate the impact of COVID-19. We expect the utility of Excavator to outlive the COVID-19 pandemic: analysts and decision makers will be empowered by Excavator to better understand and solve complex problems in the future. An interactive TCAG visualization is available at http://afrl402.bbn.com:5050/index.html. We also released a demonstration video at https://vimeo.com/528619007.

Timely responses from policy makers to mitigate the impact of the COVID-19 pandemic rely on a comprehensive grasp of events, their causes, and their impacts. Since the beginning of the COVID-19 pandemic, an enormous number of articles have been published every day, reporting many events related to COVID as well as studies related to COVID. It is very difficult, if not impossible, to keep track of these developing events or to get a comprehensive overview of the temporal and causal dynamics underlying these events.
To aid policy makers in overcoming the information overload, we developed ExcavatorCovid (or Excavator for short), a system that ingests open-source text sources (e.g., news articles and scientific publications), extracts COVID-19 related events and relations between them, and builds a Temporal and Causal Analysis Graph (TCAG). Excavator combines the following NLP techniques:
• Extracting events (§3) for types in our comprehensive COVID-19 event taxonomy (§2). Each event has a time and a location if they are available in the text, allowing analyses targeted at specific times or geographic regions of interest.
• Extracting three types of temporal and causal relations (§4) between pairs of events.
• Constructing a TCAG (§5) by assembling all events and relations, to provide a comprehensive overview of the events related to COVID-19 as well as their causes and impacts.
• Supporting trend and correlation analysis of events, by visualizing event popularity time series (§6) in the TCAG visualization.

Excavator produces a TCAG that is in a machine-readable JSON format and is also human-understandable (visualized via a web-based interactive user interface), to support varied analytical and decision making needs. We hope that Excavator will aid government agencies in their efforts to understand likely downstream effects of political and economic decisions and events related to the pandemic, and to respond in a timely manner to mitigate the impact of COVID-19. The benefit of Excavator is realized through a comprehensive visualization of events and how they affect each other. We expect the utility of Excavator to outlive the COVID-19 pandemic: analysts and decision makers will be empowered by Excavator to better understand and solve complex problems in the future.
We first present our COVID-19 event taxonomy, and then we present details about event extraction, causal and temporal relation extraction, measuring event popularity using news text as "quantitative data", and the approach for constructing a TCAG. We then describe the system demonstration, present a quantitative analysis of the extractions, and conclude with recommended use cases.

COVID-19 affects many aspects of our political, economic, and personal lives. A comprehensive analysis requires an event taxonomy that categorizes the events related to COVID-19 in many sectors and domains. We developed a COVID-19 event taxonomy using a hybrid approach of manual curation with automated support. First, we run Stanza (Qi et al., 2020) on a large sample (10%) of the Aylien coronavirus news dataset (§7) to tag verb and noun phrases that are likely to trigger events. Second, we represent each phrase as the average of the BERT (Devlin et al., 2019) contextualized embedding vectors of the subwords within the phrase, and then run committee-based clustering (Pantel and Lin, 2002) over the vector representations of the phrases to discover salient clusters. Finally, we review the frequently appearing clusters and define event types related to COVID-19.

The event taxonomy includes 76 event types and a short description of each type. Figure 1 illustrates several branches of the event taxonomy (the complete taxonomy will be publicly available via github.com). The events come from a wide range of domains. We also manually added the hyponymy relation via is-a links (e.g., COVID-19 is-a {Virus, Disease}) between pairs of event types.

We developed a neural network model for extracting events defined in the COVID-19 event taxonomy (the event classification stage) and extracting the location and time arguments (the event argument extraction stage), if they are mentioned in text, for each event mention. The structured representation (events with location and/or time) enables analyses of events targeting a specific time or location.

Figure 2: (a) shows the architecture of the model, which takes a sequence of words $x_1, x_2, \ldots, x_n$ as input and outputs a sequence of tags $y_1, y_2, \ldots, y_n$. (b) and (c) show an example for each of the two stages. "PolicyInt" is short for "PolicyIntervention".

Both stages use a BERT-based sequence tagging model. Figure 2 (a) shows the model architecture. Given a sequence of tokens as input, the model predicts a sequence of tags, one per token. We use the common Begin-Inside-Outside (BIO) tags for event types in the event classification task and for event argument role types in the argument extraction task.

Event classification: a sequence tagging model is trained to predict BIO tags of event types, such that it identifies the event trigger span as well as the event type. Figure 2 (b) shows an example.

Event argument extraction: similarly, another sequence tagging model is trained to predict BIO tags of argument role types, such that it identifies the token spans of event arguments as well as their argument role types, with respect to a trigger that has already been identified in the event classification stage and marked in the input sentence with "<t> ... </t>". Figure 2 (c) shows an example.

We run these two models in a pipeline: the event classification model is applied first to find event triggers and classify their types, then the event argument extraction model is applied to find location and time arguments for each event mention.

Training data curation. We apply our prior work on rapid customization for event extraction (Chan et al., 2019) to curate a dataset for training the event classification model. Our developer spent about 13 minutes per event type to find, expand, and filter potential event triggers in a held-out 10% of the Aylien coronavirus news corpus.
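To make the BIO tagging scheme used by both stages concrete, the following is a minimal sketch of how a tag sequence can be decoded into labeled spans. It is an illustration only, not the system's code: the function name `decode_bio`, the example sentences, the role labels `Location` and `Time`, and the choice to treat `<t>` and `</t>` as separate tokens are all assumptions made for this example.

```python
from typing import List, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (label, span text) pairs from BIO tags, one tag per token."""
    spans: List[Tuple[str, str]] = []
    label, words = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new span begins; close any span that is still open.
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continue the currently open span of the same label.
            words.append(token)
        else:
            # "O", or an I- tag that does not continue the open span.
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = None, []
    if label is not None:
        spans.append((label, " ".join(words)))
    return spans


# Stage 1 (event classification): tags carry event types (hypothetical sentence).
stage1 = decode_bio(
    ["Ohio", "imposed", "a", "lockdown", "in", "March"],
    ["O", "O", "O", "B-Lockdown", "O", "O"],
)

# Stage 2 (argument extraction): the trigger is marked with <t> ... </t>
# and tags carry argument roles (role label names are assumptions).
stage2 = decode_bio(
    ["Ohio", "imposed", "a", "<t>", "lockdown", "</t>", "in", "March"],
    ["B-Location", "O", "O", "O", "O", "O", "O", "B-Time"],
)
print(stage1)  # [('Lockdown', 'lockdown')]
print(stage2)  # [('Location', 'Ohio'), ('Time', 'March')]
```

Running the two stages in sequence, as in the pipeline described above, yields a trigger with its event type first and then the location and time spans attached to that trigger.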
The statistics of the curated dataset are shown in Table 1 (we only show the top-10 most frequent event types for brevity). In total, there are 11814 mentions in 7159 sentences. We plan to make this dataset available via github.com.

To train the argument extraction model, we use the related event-argument annotation from the ACE 2005 dataset (Doddington et al., 2004). We focus on location and time arguments and ignore other roles. At decoding time, after extracting the argument mentions for events, we apply the AWAKE (Boschee et al., 2014) entity linking system to resolve each location argument to a canonical geolocation, and use SERIF (Boschee et al., 2005) to resolve each time argument to a canonical time, which we then convert to the month level. This allows us to perform analyses of events targeting a specific geolocation or month of interest.

We develop two approaches for extracting temporal and causal relations: a pattern-based approach and a neural network model. We take the union of the outputs from both approaches to maximize recall. The list of causal and temporal relations extracted by the systems is shown in Table 2. Our extractors extract relations at the subtype level. However, we decided to merge the subtypes into types because (a) a user survey shows that users prefer a simplified definition of causality that only includes "event X causes (positively impacts) event Y" and "X mitigates (reduces/prevents) Y", because finer-grained distinctions at the subtype level are difficult and less useful, and (b) merging the subtypes into types improves accuracy to near or above 0.8, as shown in Table 4, compared to 0.7 at the subtype level, where the extraction approaches struggle to differentiate between the subtypes.

Pattern-based relation extraction. We applied the temporal and causal relation extraction patterns from LearnIt (Min et al., 2020).
A pattern is either a lexical pattern, which is a sequence of words between a pair of events, e.g., "X leads to Y", or a proposition pattern, which is the (nested) predicate-argument structure that connects the pair of events. For example, "verb:cause[subject=X] [object=Y]" is the proposition counterpart of the lexical pattern "X causes Y".

Neural relation extraction. We developed a mention pooling (Baldini Soares et al., 2019) neural model for causal and temporal relation extraction. Figure 3 shows the model architecture. Taking as input a sentence in which a pair of event mention spans are marked, the model first encodes the sentence with BERT (Devlin et al., 2019). For each of the left and right event mentions, it then uses average pooling over the BERT contextualized vectors of the words in the span to obtain fixed-dimension vectors $V_1$ and $V_2$ as the span representations. It then concatenates $V_1$ and $V_2$ with the element-wise absolute difference $|V_1 - V_2|$ to generate the pair representation $V = (V_1, V_2, |V_1 - V_2|)$. $V$ is passed into a linear layer followed by a softmax layer to make the relation prediction. The model is trained with a blended dataset consisting of the Entities, Events, Simple and Complex Cause Assertion Annotation datasets (footnote 5) released by LDC (footnote 6), and 1.5K temporal relation instances generated by applying the LearnIt temporal relation extraction patterns to 10,000 sampled Gigaword (Parker et al., 2011) articles.

We aggregate all extracted events and causal and temporal relations across the corpus to construct a TCAG. The TCAG is shown in the interactive visualization, in which each node is an event type and each edge is a causal or temporal relation (footnote 7).
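As a rough sketch of this aggregation step, the snippet below groups event mentions into one node per event type and relation instances into one edge per (source type, relation, target type) triple, and serializes the result as JSON. The function name `build_tcag`, the input layout, and the JSON field names (`nodes`, `edges`, `id`, `count`, and so on) are assumptions made for illustration; the actual TCAG schema is not specified here, and the toy mentions and relations are invented.

```python
import json
from collections import Counter
from typing import Dict, List, Tuple


def build_tcag(
    event_mentions: List[Dict[str, str]],
    relations: List[Tuple[str, str, str]],
) -> Dict[str, list]:
    """Aggregate mentions and relations into a type-level graph.

    event_mentions: dicts with at least a "type" key (hypothetical layout).
    relations: (source event type, relation type, target event type) triples.
    """
    # One node per event type, weighted by how often the type is mentioned.
    node_counts = Counter(m["type"] for m in event_mentions)
    # One edge per distinct triple, weighted by how often it was extracted.
    edge_counts = Counter(relations)
    nodes = [{"id": t, "count": c} for t, c in sorted(node_counts.items())]
    edges = [
        {"source": s, "relation": r, "target": t, "count": c}
        for (s, r, t), c in sorted(edge_counts.items())
    ]
    return {"nodes": nodes, "edges": edges}


# Toy input, invented for illustration only.
mentions = [
    {"type": "Lockdown"},
    {"type": "Lockdown"},
    {"type": "EconomicCrisis"},
    {"type": "DiseaseSpread"},
]
extracted = [
    ("Lockdown", "Causes", "EconomicCrisis"),
    ("Lockdown", "Causes", "EconomicCrisis"),
    ("Lockdown", "Mitigates", "DiseaseSpread"),
]
graph = build_tcag(mentions, extracted)
print(json.dumps(graph, indent=2))
```

Keeping the counts on nodes and edges is one simple way to retain the frequency information that a visualization could later map to node size or edge thickness.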
We use a simple approach to aggregate events: by default, all event mentions sharing the same type are grouped into a single node named by the type. We rely on the UI to let the user selectively focus on a specific location and/or time, such that the UI only shows a TCAG involving the event mentions and the causal relations between pairs of events for the location and/or time of interest.

Footnote 5: The catalog IDs of the LDC datasets are LDC2019E48, LDC2019E61, LDC2019E70, LDC2019E82, LDC2019E83.
Footnote 6: www.ldc.upenn.edu
Footnote 7: is-a relations are also added as dashed edges in the TCAG.

The TCAG only provides a qualitative analysis of the temporal and causal relations between the COVID-related events. It is more informative if we can also measure the popularity of events through time to enable trend analysis (e.g., does lockdown go up or down between January and May 2020?) and correlation analysis (e.g., will a stricter lockdown improve or deteriorate the economy?). To support these analyses, we produce a time series of a popularity score for each event type (a.k.a. the event timeline). Extending our prior work (Min and Zhao, 2019), we define the popularity score for event type $e$ at time $t$ as $\frac{N_{e,t}}{M_t}$, in which $N_{e,t}$ is the frequency of event $e$ in month $t$ and $M_t$ is 1/500 of the total number of articles published in month $t$. We calculate the moving average centered at each $t$ with a sliding window of $T = 3$ months to reduce noise. The raw event frequency counts can be inflated by the increasing level of media activity. Therefore, we divide the raw counts by $M_t$ to normalize them so that they are comparable across different months.

Datasets. We run Excavator on the following two corpora to produce a TCAG for COVID-19. The first corpus is 1.2 million articles from the Aylien Coronavirus News Dataset, which contains 1.6 million COVID-related articles published between November 2019 and July 2020 by ∼440 news sources.
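For concreteness, the popularity score defined above (the monthly event frequency divided by $M_t$, smoothed with a 3-month centered moving average) can be sketched as follows. This is an illustration under stated assumptions, not the system's code: the function name `popularity_series` is hypothetical, the smoothing is assumed to be applied to the normalized ratio rather than to the raw counts, and at the first and last months the window is assumed to shrink to the months that exist. The counts in the example are invented.

```python
from typing import List


def popularity_series(
    event_counts: List[int],
    article_counts: List[int],
    window: int = 3,
    scale: int = 500,
) -> List[float]:
    """Normalized, smoothed monthly popularity for one event type.

    event_counts[t]   : N_{e,t}, mentions of the event type in month t
    article_counts[t] : total number of articles published in month t
    """
    if len(event_counts) != len(article_counts):
        raise ValueError("both series must cover the same months")
    # M_t is 1/scale of the article count; the score is N_{e,t} / M_t.
    normalized = [
        n / (total / scale) for n, total in zip(event_counts, article_counts)
    ]
    half = window // 2
    smoothed = []
    for t in range(len(normalized)):
        # Centered window, truncated at the series boundaries (assumption).
        lo, hi = max(0, t - half), min(len(normalized), t + half + 1)
        smoothed.append(sum(normalized[lo:hi]) / (hi - lo))
    return smoothed


# Invented counts for three consecutive months.
scores = popularity_series([100, 200, 300], [50000, 50000, 50000])
print(scores)  # [1.5, 2.0, 2.5]
```

With 50,000 articles per month, $M_t = 100$, so the normalized values are 1.0, 2.0 and 3.0 before smoothing; a month with more articles would scale the same raw count down proportionally.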
We only kept the articles that were published between January and May 2020, since the corpus contains fewer articles in other months. The second corpus is the COVID-19 Open Research Dataset (Wang et al., 2020). It contains coronavirus-related research from PubMed's PMC corpus, a corpus maintained by the WHO, and bioRxiv and medRxiv pre-prints. As of 11/08/2020, it contains over 300,000 scholarly articles. We combine these two corpora because news and research articles are complementary: news articles are rich in real-world events and are up to date, while analytical articles contain more causal relationships. Therefore, combining them is likely to lead to a more comprehensive analysis and new insights.

Overall statistics of extractions. Excavator extracted 6.2 million event mentions of 59 types. Table 3 shows the event types that appear more than 50,000 times. We randomly sampled 100 event mentions, manually reviewed them, and found that the extracted events are 83% accurate. Excavator extracted 226,176 causal and temporal relations from the two corpora. A summary of the extracted relations and their precision is shown in Table 4.

TCAG visualization. We developed an interactive visualization of the TCAG. Figure 4 shows a small part of the TCAG centered on the event Lockdown. Each node represents an event type in our COVID event taxonomy for which Excavator is able to extract events and track their popularity scores (§6) through time. The three types of relational edges (Causes, Mitigates and Before) are shown in different colors. The size of a node and the thickness of an edge indicate, on a log scale, the relative frequency of the event type and of the relation, respectively. For example, Figure 4 shows that Death is mentioned more frequently than Lockdown, and that the causal relation {Lockdown, Causes, EconomicCrisis} appears more frequently than {Lockdown, Mitigates ("reduces"), AccessToHealthcare}.
To support analysis focusing on a single event, we color the focused event in blue, events that cause or precede the focused event in orange, and events that the focused event causes or precedes in green.

Event popularity timeseries visualization. For each node (event) in the TCAG visualization, we show its event popularity timeseries visualization on the side. Figure 5 shows 3 screenshots of the event popularity timeseries (§6) visualization between January and May 2020 for Lockdown, EconomicCrisis and COVID-19, respectively.

We describe 3 recommended use cases below. More details are in our demonstration video.

Use case 1: causal and temporal analysis. We can get a panoramic view of the underlying causal and temporal dynamics between events related to COVID from the overall TCAG. We can start by analyzing the causal or temporal relations centered at an event of interest. For example, Figure 4 shows a diverse range of effects and consequences of Lockdown, such as EconomicCrisis (economic), Shortage (supply-chain), FearOrPanic (mental), etc. Interestingly, the graph also reveals surprises such as {Lockdown, Causes, Death}: the UI shows supporting evidence such as "lockdown exacerbates deaths and chronic health problems associated with poverty, ...". Furthermore, the TCAG shows that Lockdown mitigates DiseaseSpread but also has a negative impact on the Economy, which informs decision makers that they need to understand the economic trade-offs when implementing the Lockdown policy.

Figure 4: A screenshot of a partial TCAG centered on Lockdown. Green, pink, and purple edges show Cause, Mitigate and Before relations, respectively. Blue, orange and green nodes show the focused node and nodes with incoming and outgoing edges (with respect to the focused node), respectively.

We can also analyze longer-distance causal pathways consisting of two or more causal/temporal edges.
For example, our demo video shows that COVID-19 causes or precedes (Before) Lockdown, and that Lockdown causes or precedes EconomicCrisis. This helps us understand details about how COVID causes EconomicCrisis.

Use case 2: trend and correlation analysis. We can inspect the event timeline for a node or an edge to perform a trend analysis or a correlation analysis, respectively. Figure 5 shows screenshots of the event popularity timeseries between January and May 2020 for Lockdown, EconomicCrisis and COVID-19. First, the user can click on a single event to perform a trend analysis: the popularity of Lockdown goes up continuously, indicating an upward trend in implementing lockdown policies in more geographic regions. The user can also click on an edge to perform a correlation analysis for a pair of events: when the user clicks on the edge {Lockdown, Causes, EconomicCrisis}, the UI shows a strong correlation between the two upward curves. For another edge, "Lockdown mitigates COVID-19", the UI shows a negative correlation near the end: as Lockdown rises, COVID-19 slightly falls.

Use case 3: analyses targeted at geolocations. The event timeline visualization also allows the user to see the timeline for individual geolocations, such as each U.S. state, instead of the aggregate for the entire U.S. Figure 6 is a screenshot showing the timelines for Lockdown for the 10 most frequently mentioned U.S. states. The screenshot shows that the curves for California and New York go much higher than those of other states. This roughly matches the stricter lockdown policies implemented in the two states during this time period, compared to other states. Such targeted analysis is made possible because our events have location and time arguments. We can also make the TCAG show only the events and relations for a specific state, if a user selects a state of interest in the UI.

Extracting events.
Event extraction has been studied using feature-based approaches (Huang and Riloff, 2012; Ji and Grishman, 2008) or neural networks (Chen et al., 2015; Nguyen et al., 2016a; Wadden et al., 2019; Liu et al., 2020). GDELT (Leetaru and Schrodt, 2013)

There is a large body of work on temporal (D'Souza and Ng, 2013; Chambers et al., 2014; Ning et al., 2018b; Meng and Rumshisky, 2018; Han et al., 2019; Vashishtha et al., 2020; Wright-Bettner et al., 2020) and causal (Bethard and Martin, 2008; Do et al., 2011; Riaz and Girju, 2013; Roemmele and Gordon, 2018; Hashimoto, 2019) relation extraction. Mirza and Tonelli (2016) and Ning et al. (2018a) extract both in a single framework.

Constructing Causal Graphs from Text. Eidos (Sharp et al., 2019) uses a rule-based approach to extract causal relations and build a causal analysis graph, but it has limited coverage of events related to COVID-19. LearnIt (Min et al., 2020) enables rapid customization of causal relation extractors, but it does not focus on causal relations involving COVID-related events. This work also differs from these two in that we extract event arguments and temporal relations, and track event popularity.

We present the Excavator system, a web-based TCAG visualization, and a video demonstration.
Matching the blanks: Distributional similarity for relation learning
Learning semantic links from a corpus of parallel temporal and causal relations
Researching persons & organizations: Awake: From text to an entity-centric knowledge base
Automatic information extraction
Dense event ordering with a multi-pass architecture
Rapid customization for event extraction
Event extraction via dynamic multi-pooling convolutional neural networks
BERT: Pre-training of deep bidirectional transformers for language understanding
Minimally supervised event causality identification
The automatic content extraction (ACE) program: tasks, data, and evaluation
Classifying temporal relations with rich linguistic knowledge
Deep structured neural network for event temporal relation extraction
Weakly supervised multilingual causality extraction from Wikipedia
Modeling textual cohesion for event extraction
Refining event extraction through cross-document inference
GDELT: Global data on events, location, and tone
Event extraction as machine reading comprehension
Context-aware neural model for temporal information extraction
LearnIt: On-demand rapid customization for event-event relation extraction
Measure country-level socio-economic indicators with streaming news: An empirical study
CATENA: Causal and temporal relation extraction from natural language texts
Joint event extraction via recurrent neural networks
A two-stage approach for extending event detection to new types via neural networks
Joint reasoning for temporal and causal relations
Improving temporal relation extraction with a globally acquired statistical resource
Richer event description: Integrating event coreference with temporal, causal and bridging annotation
Document clustering with committees
Linguistic Data Consortium, Philadelphia
Event detection and co-reference with minimal supervision
Stanza: A Python natural language processing toolkit for many human languages
Toward a better understanding of causality between verbal events: Extraction and analysis of the causal power of verb-verb associations
An encoder-decoder approach to predicting causal relations in stories
From free text to executable causal models
Temporal reasoning in natural language inference
Entity, relation, and event extraction with contextualized span representations
CORD-19: The COVID-19 open research dataset
Defining and learning refined temporal relations in the clinical narrative

This research is based upon work supported in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via Contract No.: 2021-20102700002. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation therein.