An Overview of Event Extraction and Its Applications
Xiaohong Huang
School of Information Management and Engineering
Shanghai University of Finance and Economics
Shanghai 200433, China
huangxiaohong@163.sufe.edu.cn
November 8, 2021
Abstract
With the rapid development of information technology, online platforms have produced enormous
text resources. As a particular form of Information Extraction (IE), Event Extraction (EE) has gained
increasing popularity due to its ability to automatically extract events from human language. However,
there are only a few literature surveys on event extraction. Existing reviews either devote much effort to describing the details of various approaches or focus on a particular field. This study provides a comprehensive overview of state-of-the-art event extraction methods and their applications to text, covering both closed-domain and open-domain event extraction. A trait of this survey is that it provides an overview at moderate complexity, avoiding excessive detail about particular approaches. It focuses on the common characteristics, application fields, advantages, and disadvantages of representative works, leaving aside the specificities of individual approaches. Finally, we summarize the common issues, current solutions, and future research directions. We hope this work can help researchers and practitioners obtain a quick overview of recent event extraction research.
Keywords Event extraction · information extraction · natural language processing (NLP) · text mining (TM) · survey
1 Introduction
With the rapid development of information technology, electronic textual data generated by the Internet provide a
resource of unbounded information-bearing potential. Over the years, Information Extraction (IE) has gained increasing
popularity because it helps exploit this potential by automatically extracting content from human language [1]. Event
Extraction (EE) originated in the late 1980s when the U.S. Defense Advanced Research Projects Agency (DARPA)
boosted research into message understanding [2]. Now event extraction has become an important and challenging task,
which aims to discover event triggers with specific types and their arguments [3].
Event extraction plays an important role in many applications in various fields. In the security field, Tanev et al. [4]
perform real-time news event extraction for global crisis monitoring. In the intelligent transportation field, Sakaki et
al. [5] develop a system that extracts real-time driving information using social media to provide important events for
drivers, such as traffic jams and weather reports. Sheng et al. [6] study the overlapping event extraction problem in
the financial field, and Zheng et al. [7] propose a novel end-to-end document-level event extraction framework from
Chinese financial announcements. In the social media field, Ritter et al. [8], Zhou et al. [9], Kunneman and Van Den
Bosch [10], and Peng et al. [11] develop novel open-domain event extraction models to extract events from Twitter. In
the biomedical field, there is much research that extracts medications and associated adverse drug events (ADEs) from
clinical documents [12, 13, 14]. In the legal field, extracting events from court decisions can provide a visual overview
of what happened throughout a case by representing the main legal events, together with relevant temporal information
[15]. Many studies have focused on proposing new approaches to tackle the challenges of general event extraction
[16, 17, 6, 18, 19].
According to different classification criteria, the existing event extraction literature can be grouped into different categories. We summarize the typical research works and categorize them in Figure 1. Event extraction comprises two mainstream branches: closed-domain and open-domain event extraction. The former aims to discover event triggers with specific types and their arguments, whereas the latter concentrates on detecting new events or tracking the change of state of a known event. This study mainly summarizes the literature from the technique view, with other classification criteria as supplements.
Figure 1: Categorization of typical event extraction research (e.g., by the corpus level on which extraction is performed: sentence level, document level, and cross-document).
Closed-domain event extraction. From the view of techniques used, existing approaches can be divided into four categories: pattern matching, machine learning, deep learning, and semi-supervised learning methods. It is worth noting that semi-supervised learning methods are treated as a separate category because much recent research uses semi-supervised or distant supervision methods to enhance corpora, which has become a research hotspot.
From the view of how a model is trained, existing approaches can be categorized into pattern matching, pipelined training, and joint training methods. The choice mainly depends on how researchers treat the subtasks of event extraction.
From the view of whether much expert knowledge is needed, existing approaches can be divided into knowledge-driven, data-driven, and hybrid methods [20]. Knowledge-driven methods usually need expert knowledge to design delicate patterns. Data-driven approaches mainly exploit knowledge from big data through statistics or deep learning methods. Hybrid approaches combine the above-mentioned methods.
From the view of the corpus level on which the event extraction task is performed, existing research can be divided into sentence-level, document-level, and cross-document extraction.
Open-domain event extraction. Open-domain event extraction differs greatly from closed-domain event extraction because it focuses on detecting new or unexpected events from texts. There are no predefined event types, and event schema induction is a critical subtask of open-domain event extraction. From the view of techniques used, existing approaches can be divided into Bayesian-based [21], clustering-based [11], parsing-based [8], lexicon-based [22], semi-supervised [19], distant supervision based [15], and adversarial domain adaptation based [23] methods. From the view of the task target, the existing research can be categorized into new event detection, event generation, and event tracking.
Despite the importance and popularity of event extraction, there are few comprehensive reviews and summaries of recent event extraction research [20, 24, 25]. Most surveys focus on a specific field, for example, deep learning schema-based event extraction [26], multilingual event extraction [27], event extraction from social networks [28], biomolecular event extraction [29, 30], event extraction for decision support systems [2], etc. Another limitation is that most existing surveys, including comprehensive reviews, lack a summary of recent open-domain event extraction research. From this view, we review and provide an overview of recent event extraction literature. Different from previous surveys, we summarize the contributions of this study as follows:
(1) We systematically review the event extraction literature from the technique view, covering both closed-domain and open-domain event extraction. In each section, we review the models, techniques, event levels, datasets, and application fields of the representative research and summarize them in a corresponding table by year.
(2) A trait of this survey is that we try to provide an overview at moderate complexity. We leave aside the specificities of individual studies and avoid discussing their details, focusing instead on the common characteristics, application fields, advantages, and disadvantages of representative works. We hope this work can help researchers and practitioners obtain a quick outline of recent event extraction research.
(3) We summarize the common issues and challenges that hinder the generalization and industrial application of event extraction, and we also discuss the corresponding current solutions and future research directions.
The remainder of this paper is organized as follows. We first introduce the event extraction task definition, commonly used corpora, and evaluation metrics. We then review and summarize the literature from the technique view, with closed-domain event extraction in Section 3 and open-domain event extraction in Section 4. Section 5 summarizes and discusses current common research issues and future directions. Conclusions follow in Section 6.
2 Event Extraction
As a particular form of information extraction, event extraction involves named entity recognition (NER) and relation extraction (RE), and mostly depends on the results of these tasks. As an interdisciplinary subject, event extraction is closely related
to computer science, statistics, and natural language processing. We demonstrate the relations from its fundamentals to
its applications in Figure 2.
Figure 2: Demonstration of the relationship between event extraction and other interdisciplinary subjects and techniques.
Following the event extraction task definition in ACE 2005, an event is frequently described as a change of state, indicating a specific occurrence of something that happens at a particular time and in a specific place, involving one or more participants. It can help answer the "5W1H" questions, i.e., "who", "when", "where", "what", "why" and "how", about an event. ACE employs the following terminologies to describe an event extraction task:
Event mention: An event mention is usually a phrase or sentence that describes an event, in which a trigger and the corresponding arguments are included.
Event trigger: The trigger is usually a verb or a noun that most clearly expresses the core meaning of an event.
Event type: The event type is the category to which the event corresponds. In most cases, event types are predefined manually and categorized by event triggers. For instance, eight event types and 33 subtypes are predefined in the ACE 2005 event corpus. In open-domain event extraction, event types are not predefined explicitly but can usually be represented by the event trigger.
Event argument: Event arguments are the main attributes of events. They are usually entity mentions describing the event state change, involving who, what, when, where, and how.
Argument role: An argument role is the function or position that an event argument fills in the relationship between the argument and the trigger.
For example, two event types are involved in sentence S1: "Die" and "Attack", triggered by "died" and "fired", respectively. For the Die event, "Baghdad", "cameraman", and "American tank" are its arguments with the corresponding roles Place, Victim, and Instrument, respectively. For the Attack event, "Baghdad", "cameraman", "American tank" and "Palestine Hotel" are its arguments with the corresponding roles Place, Victim, Instrument and Target, respectively. This is a somewhat more complex example with three shared arguments, which is more challenging than the simple case of one event type in one sentence. Figure 3 shows the event extraction annotation and the syntactic parser results.
• S1: In Baghdad, a cameraman died when an American tank fired on the Palestine Hotel.
Figure 3: An example of two events in one sentence: Die and Attack. The upper arcs link event triggers to their
corresponding arguments, with the argument roles on the arcs. The lower side demonstrates the syntactic parser results.
The closed-domain event extraction task can be divided into four subtasks: trigger identification, event type classification, argument identification, and argument role classification. According to how the subtasks of event extraction are organized, most existing closed-domain event extraction methods can be divided into two mainstream categories: pipeline-based and joint-based methods. The pipeline-based method follows the idea of divide-and-conquer algorithms; its advantage is that it simplifies each subtask and can provide information for subsequent subtasks. In contrast, its disadvantages are that it propagates cascading errors and that the overall performance heavily relies on the earlier subtasks. The joint-based method models the subtasks simultaneously and thus does not propagate errors among them. Accordingly, its disadvantages are that it cannot utilize information from previous subtasks and needs larger-scale, finely labeled data to train the models.
Event extraction corpora are annotated by professionals or experts with domain knowledge and are used to train or evaluate models. This section mainly introduces representative event extraction corpora provided by public evaluation programs or mentioned in previous literature. We summarize these popular corpora in Table 1.
• The ACE 2005 event corpus contains eight event types and 33 subtypes, with about 6000 labeled examples in
599 documents (633 Chinese documents). Events in the ACE 2005 corpus are represented in terms of their
attributes and their participants. The participants are the ACE entities that participate in the event. ACE events
are, in essence, a generalization of ACE relations [1].
• The texts in the TimeBank corpus [31] cover various media sources in the news domain. It is a gold-standard human-annotated corpus marked up for temporal expressions, events, and temporal relations holding between events and times, following the TimeML (Time Markup Language) annotation scheme. TimeBank 1.2 [32] contains 183 articles with 27,592 TimeML tags, among which 7,935 are event tags.
• The Factbank corpus [33] is built on TimeBank 1.2 and part of the AQUAINT TimeML corpus. The difference
is that the Factbank corpus is supplemented with additional information concerning the factuality of events. It
consists of 208 documents and contains a total of 9488 manually annotated events.
• The GENIA corpus [34] is a semantically annotated corpus of biological literature. GENIA corpus 3.0 consists of 1,999 abstracts taken from the MEDLINE database. The current GENIA event annotation covers 1,000 of the 1,999 abstracts of the primary GENIA corpus, marking 36,114 events in 9,372 sentences. More detailed event annotation information can be found at http://www.geniaproject.org/genia-corpus/event-corpus.
• The TDT corpora (https://catalog.ldc.upenn.edu/byproject#TDT-corpora) [35, 36] are used in the Topic Detection and Tracking research programs and include the TDT Pilot, TDT2, and TDT3 corpora. The TDT Pilot corpus contains approximately 16,000 stories and 25 events. The TDT2 corpus contains over 74,000 stories with more than 100 topics. The TDT3 corpus 2.0 contains over 31,200 English stories and 12,800 Chinese stories. They are usually used in open-domain event extraction tasks: to detect the occurrence of new events (detection) and track the reoccurrence of old events (tracking).
• The GNBusiness dataset is a large-scale dataset annotated with diverse event types and explainable event
schemas, released along with the ODEE (Open Domain Event Extraction) algorithm [37]. It contains 55618
business news reports with 13047 news clusters in 288 batches from Oct. 17, 2018, to Jan. 22, 2019, among
which 680 clusters are annotated.
• The ASTRE corpus [38], dedicated to the evaluation of event schema induction, contains 1,038 documents. One hundred documents were selected from Wikinews under the category Laws & Justice. The remaining documents were retrieved with the Google search engine based on their similarity to the 100 initial seed documents. Only the Wikinews documents are manually annotated to evaluate model performance, while the others are left for unsupervised learning.
• The Patch Hate Crimes corpus [39] includes hyper-local news articles from 1,217 cities in the USA, scraped from the Patch website (https://www.patch.com) in the "Fire and Crime" category. A subset of 11,130 articles is annotated with a binary label (whether the article represents a specific hate crime) and attributes (including eight targets and four types of hate crime actions).
• The CEC corpus (Chinese Emergency Corpus, https://github.com/shijiebei2009/CEC-Corpus) collects breaking news events reported in Chinese, with a total of 332 documents. It covers five event types: earthquake, fire, traffic accident, terrorist attack, and food poisoning.
• The DuEE 2020 corpus (http://lic2020.cipsc.org.cn/) is released by Baidu and was adopted in the Language and Intelligent Technology Competition 2020. The corpus was selected and determined according to Baidu's hot search board. It consists of 17,000 sentences containing 20,000 events of 65 event types.
In summary, although various annotated event extraction corpora exist, covering both closed-domain and open-domain tasks, there are still many limitations. Firstly, from the domain view, most existing corpora are developed for closed-domain tasks with limited event types. Secondly, from the corpus size view, most corpora are small and sparse because annotation is a cost-prohibitive process. Thirdly, from the utility view, the reusability of existing corpora greatly depends on the targeted domains. Lastly, there is still a lack of a generally acknowledged large-scale corpus for open-domain event extraction.
The event extraction task, especially closed-domain event extraction, can be regarded as a classification or sequence labeling task. Most existing literature uses classification metrics to evaluate event extraction performance. In accordance with IE and TM, performance is generally measured by counting true positives and true negatives, as well as false positives and false negatives. The most used metrics, e.g., precision, recall, and F1 score, are calculated as follows:
$\mathrm{Precision} = \frac{TP}{TP + FP}$   (1)

$\mathrm{Recall} = \frac{TP}{TP + FN}$   (2)

$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} = \frac{2 \cdot TP}{2 \cdot TP + FP + FN}$   (3)
These performance measures derive from the confusion matrix. True positives (TP) and true negatives (TN) are the observations that are correctly predicted, whereas false positives (FP) and false negatives (FN) are observations for which the predicted class contradicts the actual class.
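As a minimal illustration of these metrics for event extraction, the following sketch computes micro-averaged precision, recall, and F1 over predicted triggers; the tuple format and the toy gold/predicted sets are hypothetical, not tied to any particular corpus.

```python
# Minimal sketch: micro precision/recall/F1 for predicted event triggers.
# gold and pred are sets of (sentence_id, token_index, event_type) tuples;
# the identifiers are hypothetical and only serve to align predictions with gold.
def micro_prf(gold, pred):
    tp = len(gold & pred)           # correctly predicted triggers
    fp = len(pred - gold)           # spurious predictions
    fn = len(gold - pred)           # missed gold triggers
    precision = tp / (tp + fp) if pred else 0.0
    recall = tp / (tp + fn) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

gold = {(0, 5, "Die"), (0, 10, "Attack")}
pred = {(0, 5, "Die"), (0, 12, "Attack")}
print(micro_prf(gold, pred))        # (0.5, 0.5, 0.5)
```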
Open-domain event extraction aims to detect unreported events or track the progress of previously spotted events. In most cases, it has no predefined schemas or event types. With the help of an annotated corpus, however, it can still be transformed into a classification problem and thus use the evaluation metrics above. Many works conduct open-domain event extraction with clustering algorithms, and therefore clustering evaluation metrics such as mutual information or Chi-square are often employed. For example, normalized pointwise mutual information (nPMI) can be used to measure slot coherence [37]:
$\mathrm{nPMI}(x, y) = \frac{\log\frac{f(x,y)}{f(x)\,f(y)/W}}{\log\frac{1}{f(x,y)/W}}$   (4)
where $W$ is the total number of words in the corpus, $f(x)$ and $f(y)$ are the frequencies of $x$ and $y$ in the corpus, and $f(x, y)$ is the co-occurrence frequency of the word pair $(x, y)$. Other variants, e.g., cPMI (corpus-level significant PMI) and PMI$^2$, are also used in the literature [40, 41].
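The following is a minimal sketch of how nPMI in Eq. (4) can be computed from raw corpus counts; the toy sentences and the within-sentence co-occurrence window are illustrative assumptions.

```python
import math
from collections import Counter
from itertools import combinations

# Minimal sketch of normalized PMI (Eq. 4) from raw corpus counts.
# Toy sentences stand in for a corpus; co-occurrence is counted within a sentence.
sentences = [["tank", "fired", "hotel"], ["cameraman", "died", "tank"]]

word_freq = Counter(w for s in sentences for w in s)               # f(x)
pair_freq = Counter(frozenset(p) for s in sentences
                    for p in combinations(set(s), 2))              # f(x, y)
W = sum(word_freq.values())                                        # total tokens

def npmi(x, y):
    fxy = pair_freq[frozenset((x, y))]
    if fxy == 0:
        return -1.0                                                # no co-occurrence
    pmi = math.log(fxy / (word_freq[x] * word_freq[y] / W))
    return pmi / math.log(1.0 / (fxy / W))                         # normalize by -log p(x,y)

print(round(npmi("tank", "fired"), 3))
```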
3 Closed-Domain Event Extraction
This section categorizes closed-domain event extraction approaches into pattern matching, machine learning, deep learning, and semi-supervised learning methods. The categorical arrangement also follows the time when each technique became a popular mainstream. We focus on providing an overview of closed-domain event extraction by concentrating on the most common characteristics, including the main idea, common framework, application area, advantages, and disadvantages. Many peculiarities of individual approaches are not considered in this study.
One characteristic of pattern matching based methods is that they depend on domain-specific event templates, which require a great deal of manual knowledge engineering to construct elaborately designed features. The earliest event extraction methods were mainly based on syntax trees or regular expressions.
The typical representative work is the AutoSlog system, developed by Riloff in 1993 [42]. It first defines 13 linguistic patterns with the help of a conceptual sentence analyzer. These linguistic patterns are used to automatically build a domain-specific dictionary of concepts. AutoSlog then uses the trigger word dictionary to detect a potential event. Lastly, it associates the event patterns with linguistic features, e.g., part-of-speech (POS) tags generated by the sentence parser, to assemble the argument and its corresponding role. We summarize this typical process in Figure 4.
Figure 4: The typical process of pattern matching based event extraction: an annotated corpus is used to build event patterns, which are then applied to assemble event arguments and roles.
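To make the idea concrete, the sketch below implements a toy pattern matching extractor with a hand-built trigger dictionary and regular-expression patterns; the dictionary, patterns, and roles are illustrative only and are far simpler than AutoSlog's linguistic patterns.

```python
import re

# Illustrative trigger dictionary and lexico-syntactic patterns; real systems
# such as AutoSlog derive much richer, domain-specific patterns from a parser.
TRIGGERS = {"died": "Die", "fired": "Attack"}
PATTERNS = {
    "Die":    [re.compile(r"(?P<Victim>\w+) died")],
    "Attack": [re.compile(r"(?P<Instrument>\w+ \w+) fired on (?P<Target>[\w ]+)")],
}

def extract(sentence):
    events = []
    for token in sentence.replace(",", "").split():
        etype = TRIGGERS.get(token.lower())
        if etype is None:
            continue                            # not a known trigger word
        args = {}
        for pattern in PATTERNS[etype]:
            m = pattern.search(sentence)
            if m:
                args.update(m.groupdict())      # role -> argument text
        events.append({"type": etype, "trigger": token, "arguments": args})
    return events

print(extract("a cameraman died when an American tank fired on the Palestine Hotel"))
```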
Due to its outstanding performance in specific domains, research on pattern matching based event extraction has flourished in various fields, such as biomedicine [43, 44, 16], general information extraction [45, 46], and finance and economics [47]. Yakushiji et al. [43] design a program to extract events from biomedical papers using a full parser. Kilicoglu et al. [44] use syntactic dependencies and rules to perform biological event extraction. Buyko et al. [16] incorporate manually curated dictionaries and machine learning methodologies to extract event triggers and arguments on trimmed dependency graph structures. Yangarber et al. [45] propose an automatic event pattern discovery approach, which can identify a set of relevant documents and a set of event patterns from un-annotated text, starting from a small set of "seed scenario patterns". Chang et al. [46] propose a method that effectively summarizes Chinese e-news with four main components: a Chinese POS tagger, a Chinese term filter, an event ontology filter, and a summarization agent. Borsje et al. [47] propose the use of lexico-semantic patterns for financial event extraction from RSS news feeds.
The typical characteristics lie in two aspects: (1) utilizing lexical features, e.g., part-of-speech (POS) tags, entity information, and morphological features (token, lemma, etc.); (2) utilizing delicate event patterns normally designed by experts with domain knowledge.
Several advantages of pattern-based approaches are summarized as follows. First, they need less corpus data than data-driven methods. Second, they have better interpretability because their patterns are manually designed and maintained. Third, they can achieve high extraction accuracy in a specific domain once the patterns are well designed.
We summarize the disadvantages of pattern-based approaches from the design and generalization views. First, developing and maintaining delicate event patterns is rather time-consuming and labor-intensive. Second, because pattern design strongly depends on the surface form of text, much effort is needed to transfer patterns from one domain to another. The low reusability of designed patterns or templates limits generalization.
To alleviate the difficulty of designing delicate event patterns, many researchers have explored machine learning methods to extract events. In this section, we first review the typical machine learning based event extraction literature and summarize it in Table 2 in terms of year, model, paradigm, technique, datasets used, event level, and application area. We also summarize and plot the typical abstract process in Figure 5. Then we discuss the common characteristics of typical research from the perspectives of feature engineering, paradigm, technique, and application field, without spending much effort on the details of specific methods. We finally summarize the advantages and disadvantages of machine learning based event extraction methods.
Figure 5: Demonstration of the abstract process of machine learning based event extraction approaches: dataset, feature engineering, classifiers, and event assembling.
The features reported in previous machine learning based event extraction methods can be categorized into lexical and
contextual features. Lexical features contain part-of-speech tags (POS), entity information, and morphology features
(e.g., token, lemma, etc.) [3]. Contextual features include local information (sentence level), global information
(document level), and external dictionaries. These features are complementary, and various studies have combined global evidence from related documents with local decisions [59, 60, 61]. For example, to overcome the shortcoming of analyzing sentences in isolation, Huang and Riloff [51] present a bottom-up architecture that considers a view of the larger context. It is implemented by integrating sequential sentence classifiers that capture textual cohesion, including lexical associations and discourse relations across sentences. To resolve the ambiguities of sentence-level event extraction relying on local information, Liao and Grishman [59] use document-level statistical information to
improve sentence-level event extraction to achieve document level within-event and cross-event consistency. Patwardhan
and Riloff [60] combine phrasal and sentential evidence into a probabilistic framework to enhance accuracy. Hong et al.
[53] use blind cross-entity inference to improve sentence-level ACE event extraction by considering the consistency
and distribution of entities and roles.
Considering the complexity of the event extraction task, most researchers divide the task into four subtasks: event trigger identification, event type classification, argument detection, and role classification. Much research trains the classifiers in a pipelined manner, with the advantage that earlier classifiers can provide information to later classifiers [63, 62, 61, 4, 59, 37, 57, 56, 54, 53, 52, 5, 14, 17]. For example, Peng et al. [14] propose an automatic pipeline to extract adverse drug events (ADEs) by using Naïve Bayes and Support Vector Machine (SVM) classifiers to detect drug-related tweets and perform sentiment analysis before mapping the biomedical text into drug events. However, the shortcoming of pipelined training is also obvious: error propagation (cascading defects). To deal with this problem, researchers adopt a joint training manner that treats the event extraction task as a multi-classification problem [60, 51, 50, 49, 48]. For example, Chen and Ng [49] employ joint learning for Chinese event extraction and investigate (1) various linguistic features that exploit the results of zero pronoun resolution and noun phrase coreference resolution, and (2) features that exploit trigger probability and trigger type consistency.
From the technique view, support vector machines (SVM), maximum entropy (ME), Naive Bayes (NB), conditional random fields (CRF), integer linear programming (ILP), and hierarchical agglomerative clustering (HAC) are the most used machine learning algorithms. Lu and Roth [48] present a semi-Markov CRF approach for automatic event extraction and further develop a novel learning approach called structured preference modeling (PM) that allows structured knowledge to be incorporated effectively in a declarative manner. Björne and Salakoski [52] use SVMs to extract biomedical events (detailed descriptions of biomolecular interactions) from research articles in a pipelined manner.
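As a minimal sketch of this feature-based, classifier-driven style (not any specific cited system), the following trains a linear SVM trigger classifier on hand-crafted lexical features using scikit-learn; the features and toy training sentences are illustrative assumptions.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

# Minimal sketch of a feature-based trigger classifier; features and examples are toy data.
def token_features(tokens, i):
    return {
        "token": tokens[i].lower(),
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
        "suffix3": tokens[i][-3:].lower(),
    }

train_sentences = [
    (["a", "cameraman", "died", "yesterday"], [None, None, "Die", None]),
    (["the", "tank", "fired", "shells"], [None, None, "Attack", None]),
]
X = [token_features(toks, i) for toks, labels in train_sentences for i in range(len(toks))]
y = [label or "O" for _, labels in train_sentences for label in labels]

clf = make_pipeline(DictVectorizer(sparse=True), LinearSVC())
clf.fit(X, y)
print(clf.predict([token_features(["a", "soldier", "died"], 2)]))   # likely ['Die']
```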
From the application field view, machine learning based event extraction models have been applied in many areas, including general information extraction [63, 62, 61, 59, 53, 17, 51, 50, 49, 48], biomedicine [58, 57, 56, 54, 52, 14], intelligent transportation [5], security monitoring [4], etc. For example, Sakaki et al. [5] develop a system that extracts real-time driving information from social media to offer important events to drivers, such as traffic jams and weather reports. It is beneficial for areas where Intelligent Transportation System (ITS) deployment is poor. In the security field, Tanev et al. [4] perform real-time news event extraction for global crisis monitoring. Many research efforts were centered around the BioNLP event extraction shared tasks, e.g., extracting protein interactions from text [57]. Li et al. [58] incorporate three supervised machine learning models, CRF, AdaBoost, and SVM, to automatically extract medication events from clinical text. Björne et al. [54] study the feasibility of performing event extraction at the PubMed scale. Miwa et al. [56] construct a model for extracting complex biomolecular events, e.g., binding and regulation, using rich features. Ananiadou et al. [55] review current event extraction methods for systems biology.
Much research focuses on specific domains or on improving extraction accuracy. Henn et al. [17] present case studies on how visualization techniques enhance automated event extraction. Naughton et al. [62] merge and extract events from heterogeneous news sources. There is also much research on event extraction in other languages, for example, Chinese event extraction [64, 65, 50]. Li et al. [50] employ joint learning for Chinese event extraction and address the high ratio of pseudo trigger mentions to true ones by using trigger filtering schemas.
We end this section by summarizing the advantages and disadvantages of machine learning based event extraction in comparison with pattern matching based methods. The benefits are twofold: machine learning methods reduce the effort needed to design delicate patterns, and they have better generalization and reusability. The disadvantages are threefold. First, supervised methods need more labeled data to train the model. Second, feature engineering is a time-consuming but critical step that affects extraction accuracy. Third, traditional machine learning methods have limitations in learning deep or complex nonlinear relations.
Feature engineering is the main challenge of traditional event extraction methods, and traditional machine learning methods have limitations in learning deep or complex nonlinear relations. Deep learning based methods can alleviate these shortcomings thanks to two distinguishing characteristics. First, the embedded representation of the input is suitable for big data. Second, specific deep architectures can better capture various, more complex nonlinear features. This section first reviews the recent deep learning based event extraction literature and summarizes it in Table 3 in terms of year, model, paradigm, technique, datasets used, event level, and application area. Then we discuss the common characteristics of typical research from the views of features, techniques, and application fields, without spending much effort on the details of specific methods. We finally summarize the advantages and disadvantages of deep learning based event extraction methods.
Deep learning methods can learn distributed representations of knowledge, e.g., semantic features, avoiding feature engineering. Word embeddings, character embeddings, position embeddings, entity type embeddings, POS tag embeddings, word distance, relative position, path embeddings, etc., are the most used features [68, 79, 78]. In addition to the multi-channel distributed representation of the input, researchers have employed various techniques to capture the features contained in these representations. For example, to better capture the complex relationships among local and global contexts in biomedical documents, Zhao et al. [68] use a dependency-based GCN network to capture the local context and a hypergraph to model the global context. In addition, the fine-grained interaction between the local and global contexts is captured by a series of stacked Hypergraph Aggregation Neural Network (HANN) layers. The overview of the proposed framework is shown in Figure 6.
Most recent event extraction studies are based on deep learning techniques, such as CNN [3, 79, 76], LSTM [12, 73], Transformer [7, 66, 67], GCN [67, 68, 78], Bert [75, 6, 71], etc. There are also many hybrid methods integrating the mentioned architectures to obtain superior performance [7, 12, 70]. We group the mentioned research by technique and give short introductions to the typical works.
CNN based. CNNs can capture local semantic features in a sentence well and avoid the complex feature engineering of traditional machine learning methods [3, 79]. However, a CNN may miss valuable facts when considering multiple-event sentences because it cannot capture long-term information. Chen et al. [3] use a dynamic multi-pooling convolutional neural network (DMCNN) to automatically extract lexical-level and sentence-level features. Björne and
Salakoski [79] use a CNN to capture a unified linear sentence representation, including semantic embeddings, position
embeddings, and dependency path embeddings.
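A minimal sketch of a CNN-based trigger classifier is given below; it uses plain max pooling rather than DMCNN's dynamic multi-pooling, and all sizes (vocabulary, filters, number of event types) are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Minimal sketch of a CNN-based trigger classifier (a simplification of DMCNN:
# plain max pooling instead of dynamic multi-pooling; toy sizes throughout).
class CNNTriggerClassifier(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=100, num_filters=150,
                 kernel_size=3, num_event_types=34):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size, padding=1)
        self.classifier = nn.Linear(num_filters, num_event_types)

    def forward(self, token_ids):                  # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)  # (batch, embed_dim, seq_len)
        x = torch.relu(self.conv(x))               # local n-gram features
        x = x.max(dim=2).values                    # max pooling over the sentence
        return self.classifier(x)                  # logits over event types

model = CNNTriggerClassifier()
logits = model(torch.randint(0, 5000, (2, 20)))    # 2 toy sentences of 20 tokens
print(logits.shape)                                # torch.Size([2, 34])
```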
RNN & LSTM based. RNN and LSTM architectures are good at capturing long-term and short-term memory information and are thus suitable for sequence labeling and long-dependency text; event extraction can also be regarded as a sequence labeling task. For example, Nguyen et al. [80] use two bidirectional RNNs to learn a richer representation of the sentences. This representation is then utilized to predict event triggers and argument roles jointly. Wei et al. [12] propose a Bi-LSTM-CRF-RNN-CNN approach to extract medications and associated adverse drug events (ADEs) from clinical documents. Specifically, in the named entity recognition phase, the Bi-LSTM layers calculate scores of all possible labels for each token in a sequence, and the CRF layer then predicts a token's label using its neighbors' information. In the relation classification phase, all possible candidate relation pairs are generated by a structure that integrates CNN and RNN. To deal with the error propagation issue, Wei et al. [12] propose a joint method for medication and adverse drug event extraction.
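The sketch below shows event extraction framed as sequence labeling with a bidirectional LSTM; the CRF layer used in Bi-LSTM-CRF models is omitted for brevity, and the tag set size and dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Minimal sketch of a Bi-LSTM sequence tagger for event extraction framed as
# sequence labeling; a CRF layer (as in Bi-LSTM-CRF) is omitted for brevity.
class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=100, hidden_dim=128, num_tags=9):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.tag_head = nn.Linear(2 * hidden_dim, num_tags)   # per-token tag scores

    def forward(self, token_ids):                  # (batch, seq_len)
        x = self.embed(token_ids)
        h, _ = self.lstm(x)                        # (batch, seq_len, 2*hidden_dim)
        return self.tag_head(h)                    # (batch, seq_len, num_tags)

tagger = BiLSTMTagger()
scores = tagger(torch.randint(0, 5000, (1, 12)))
print(scores.shape)                                # torch.Size([1, 12, 9])
```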
Attention & Transformer based. Attention mechanisms allow deep learning models to learn the most important information and ignore noise by allocating different weights to different embeddings. According to the object the attention mechanism works on, there are word-level, sentence-level, document-level, and channel-level attentions. The Transformer is, in essence, a multi-head self-attention architecture. Much attention-based or Transformer-based event extraction research has emerged. For example, Zheng et al. [7] propose an end-to-end model, Doc2EDAG, which generates an entity-based directed acyclic graph to perform document-level event extraction. The difference between Doc2EDAG and the classic method, Bi-LSTM-CRF, is that Doc2EDAG employs the Transformer instead of the original LSTM encoder. The Transformer layers encode a sequence of embeddings with the multi-headed self-attention mechanism to exchange contextual information across the token sequence. Lu et al. [66] also propose a sequence-to-structure generation paradigm that directly extracts events from text in an end-to-end manner. Compared with [7], a distinguishing difference is that [66] uses event schemas as constraints to control event record generation.
GCN based. Multiple events in the same sentence, arguments of one event spanning more than one sentence, and document-level event extraction all face one challenge: long-range dependencies. A common solution is to leverage universal dependency parses. Syntactic Graph Convolutional Networks (GCNs), with nodes representing tokens and edges representing directed syntactic arcs, help alleviate this challenge. To handle the difficulty of multiple events existing in the same sentence, Liu et al. [78] propose a Jointly Multiple Events Extraction (JMEE) framework to jointly extract multiple event triggers and arguments by introducing an attention-based GCN to model the dependency graph information. Ahmad et al. [67] use a Graph Attention Transformer Encoder (GATE) to learn long-range dependencies and apply it to cross-lingual relation and event extraction.
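A minimal sketch of a single syntactic GCN layer is shown below; it ignores edge labels and directions and uses simple degree normalization, which is a simplification of the GCN variants used in the cited works.

```python
import torch
import torch.nn as nn

# Minimal sketch of one syntactic GCN layer: token representations are updated
# by aggregating their syntactic neighbors' features via an adjacency matrix
# built from the dependency parse (self-loops added; edge labels ignored).
class SyntacticGCNLayer(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, h, adj):                     # h: (n, dim), adj: (n, n)
        adj = adj + torch.eye(adj.size(0))         # add self-loops
        deg = adj.sum(dim=1, keepdim=True)         # node degrees for normalization
        return torch.relu(self.linear(adj @ h) / deg)

n_tokens, dim = 6, 128
h = torch.randn(n_tokens, dim)                     # toy token representations
adj = torch.zeros(n_tokens, n_tokens)
adj[2, 1] = adj[1, 2] = 1.0                        # e.g., a "cameraman" <-> "died" arc
layer = SyntacticGCNLayer(dim)
print(layer(h, adj).shape)                         # torch.Size([6, 128])
```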
Bert based. Pretrained semantic representations, such as ELMo and Bert, have been widely used in multiple NLP tasks and have shown performance improvements across them. Bert is a bi-directional Transformer architecture that has been trained on massive corpora; it learns fairly good semantic representations conditioned on token context and retains rich textual information [6]. Recently, much research has used Bert pre-trained representations as shared textual input features. For example, Liu et al. [75] explicitly cast the event extraction task as a machine reading comprehension problem and use question-answering techniques to perform event extraction. Min et al. [71] propose an event extraction framework, ExcavatorCovid, which extracts COVID-19 related events and relations between them from news and scientific publications. These events are used to build a Temporal and Causal Analysis Graph, which helps the government sort out information and adjust related policies in a timely manner. The framework uses Bert, pooling, and linear layers to extract temporal and causal relations.
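A minimal sketch of Bert-based trigger tagging as token classification with the Hugging Face Transformers library follows; the checkpoint name and number of labels are placeholders, and the model would still need fine-tuning on an annotated corpus such as ACE 2005 before its predictions are meaningful.

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Minimal sketch of Bert-based trigger tagging as token classification; the
# checkpoint and label count are placeholders, and the classification head is
# randomly initialized here, so the outputs are untrained.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForTokenClassification.from_pretrained("bert-base-uncased",
                                                        num_labels=34)

sentence = "In Baghdad, a cameraman died when an American tank fired on the Palestine Hotel."
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits                # (1, num_subwords, num_labels)
predicted = logits.argmax(dim=-1)                  # per-subword label ids
print(predicted.shape)
```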
Other new methods. Beyond the deep learning based models mentioned above, new paradigms of event extraction have emerged, such as the question-answering based approach of Liu et al. [75], which casts event extraction as machine reading comprehension. Many works also adopt strategies to improve extraction accuracy [74]. Most existing models seldom consider the relationships between event mentions and event arguments in different sentences. To handle this challenge, Huang and Peng [74] propose a document-level event extraction framework, DEED, which leverages Deep Value Networks (DVN) to capture cross-event dependencies and perform coreference resolution.
From an application perspective, deep learning based event extraction models have been applied in many areas, including general information extraction [3, 80, 78, 77, 75, 74, 66], biomedicine [79, 12, 69, 68], finance [7, 6], multimedia [76], legal [15], social media [73, 71, 70], politics [72], cross-lingual extraction [67], etc.
We close this section by summarizing the advantages and disadvantages of deep learning based event extraction in comparison with traditional methods. Deep learning is essentially an extension and development of machine learning, so it shares many pros and cons with machine learning; here we focus on its distinguishing strengths and weaknesses. The benefits are threefold. First, deep learning methods have more powerful nonlinear expressive ability and can capture more complex relations between features, avoiding much feature engineering. Second, each deep learning method has its specialty and strong point in capturing syntactic and semantic features; for example, LSTM and Transformer architectures are both skilled at capturing long-range dependencies. Third, pre-trained models, especially Bert, can afford excellent context information and have been widely used as standard input features. The weaknesses of deep learning methods are as follows. First, due to their complex deep architectures, deep learning based models mainly rely on huge labeled corpora for training. Second, numerous parameter settings, such as the learning rate and the number of training epochs, may affect performance. To alleviate the difficulty of obtaining labeled corpora, many researchers have explored semi-supervised and unsupervised learning methods.
Most event extraction systems are trained with supervised learning and rely on a collection of annotated data. Due to
the domain-specificity of tasks, event extraction systems must be retrained with new massive annotated data for each
domain [81]. However, human-labeled training data is expensive to produce. Recently, some researchers have explored
new methods, such as semi-supervised and distant supervision methods, to automatically produce more training data.
Semi-Supervised methods. Semi-supervised learning (SSL) has attracted considerable attention to help achieve
strong generalization by making use of both unlabeled data and labeled data [13, 82, 83, 84, 85, 86, 87, 88, 89]. Much
research has used various SSL methods to help generate or augment data for event extraction: role-identifying nouns [81], linear discriminant analysis [86], Vector Quantized Variational Autoencoders [85], multi-modal Generative Adversarial Networks [89], etc.
Huang and Riloff [81] use role-identifying nouns to learn extraction patterns by a bootstrapping solution. Then the
role-identifying nouns and patterns are used to create training data for event extraction classifiers. Mansouri et al. [86]
first use a convolutional neural network to extract explicit features from text and images, then use linear discriminant analysis (LDA) to predict the classes of unclassified data. Once the desired prediction accuracy is reached, explicit features and predicted labels are used to finally predict whether a piece of news is fake or real. The labeled and unlabeled
instances are incorporated for training the semi-supervised learning model. Chen et al. [89] extend the multi-modal
Generative Adversarial Network (mmGAN) model to a semi-supervised architecture, which attempts to discriminate if
the data is real or generated and categorize it into one of the two classes: traffic event or non-traffic event. As shown in
Figure 7, the multi-modal feature learning architecture consists of three components: a Generator G, a Discriminator D,
and a Classifier C.
Figure 7: Multi-modal feature learning from both sensor time series and text embeddings.
Different from the mentioned methods focusing on data generation and data augmentation, Zhou et al. [88] design
a novel semi-supervised framework DualQA (dual question answering), to solve the event argument extraction in
low-resource scenarios.
Distant Supervision Methods. Distant supervision is a successful paradigm that gathers training data for event extraction systems by automatically aligning vast databases of facts with text [90, 91, 92, 93, 94]. For example, Reschke et al. [90] present a new publicly available dataset and apply the distant supervision approach to plane crash events. Yang et al. [91] first use distant supervision (DS) to automatically generate labeled data and then apply a sequence tagging model to extract document-level events from financial announcements. The data generation contains two steps. First, the event trigger can be automatically marked by querying a pre-defined dictionary (a financial event knowledge base); thus event mentions can be automatically identified, and the event trigger and event arguments labeled accordingly. Second, once an event mention is identified, it is labeled as a positive example, and the rest of the sentences in the announcement are marked as negative examples, which together constitute the document-level data. The deep event extraction architecture has a Bi-LSTM-CRF module for sentence-level and a CNN module for document-level event extraction. Zuo et al. [92] first design a Lexicon Enhanced Annotator (LexiAnno) to extract many causal event pairs based on linguistic knowledge and employ them to automatically label sentences via distant supervision. Experimental results show that the proposed data augmentation framework outperforms other benchmark methods. To address the lack of data and the imbalanced coverage of crisis types, Alrashdi and O'Keefe [93] utilize distant supervision to automatically generate large-scale labeled tweet data for crisis response.
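The sketch below illustrates the dictionary-matching step of distant supervision for data generation, in the spirit of (but not identical to) the trigger-dictionary labeling described above; the event dictionary, type names, and sentences are toy assumptions.

```python
# Minimal sketch of distant supervision for training data generation: sentences
# containing a trigger word from a pre-defined event dictionary are automatically
# labeled with that event type; remaining sentences become negative examples.
# The dictionary, event types, and sentences are toy examples.
EVENT_DICTIONARY = {"pledged": "EquityPledge", "repurchased": "EquityRepurchase"}

def distant_label(document_sentences):
    labeled = []
    for sent in document_sentences:
        tokens = sent.lower().split()
        matches = [(tok, EVENT_DICTIONARY[tok]) for tok in tokens
                   if tok in EVENT_DICTIONARY]
        if matches:
            trigger, event_type = matches[0]
            labeled.append({"sentence": sent, "label": event_type,
                            "trigger": trigger})              # positive example
        else:
            labeled.append({"sentence": sent, "label": "None"})  # negative example
    return labeled

doc = ["The company pledged 1.2 million shares to the bank.",
       "Trading resumed on Monday."]
for example in distant_label(doc):
    print(example)
```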
Every event extraction approach has its own merits and demerits. Combining different techniques can integrate the advantages of multiple methods and significantly enhance performance. An increasing number of researchers therefore employ multiple approaches, i.e., hybrid models. We review the existing literature and discuss it in two scenarios: single event extraction tasks and comprehensive systems.
Integrating different paradigms. As discussed above, we have divided the research into four paradigms: pattern matching methods, machine learning methods, deep learning methods, and data augmentation (semi-supervised and distant supervision) methods. Many researchers combine more than one paradigm to enhance the accuracy of event extraction. For example, Reschke et al. [90] extend the distant supervision approach to template-based event extraction and construct a new corpus, then use a linear-chain CRF model to test performance on this dataset. Yang et al. [91] use pattern-based methods to annotate the sentence-level and document-level corpus, then use a deep learning method to perform event extraction.
Integrating different techniques. CRF and Bi-LSTM-CRF are widely used in NER tasks, while SVM and RNN-CNN are widely used in relation classification tasks; RNN is good at capturing global features, whereas CNN is good at capturing local features. Wei et al. [12] propose a Bi-LSTM-CRF-RNN-CNN approach to extract medications and associated adverse drug events (ADEs) from clinical documents. Li et al. [58] incorporate three supervised machine learning models, CRF, AdaBoost, and SVM, to automatically extract medication events from clinical text. GCN is good at modeling long-range dependencies in the parse, and the Transformer is good at capturing the most important information; Ahmad et al. [67] propose a deep model integrating GCN and Transformer to generate structured contextual representations based on the dependency parse results.
Pre-trained models such as Bert can represent contextual semantic information well and have been used as standard input features. Other deep learning architectures can then be stacked on this input layer, fine-tuned, and trained to execute related tasks. Lybarger et al. [69] extract COVID-19 diagnoses and symptoms from clinical text. In this work, Bert, Bi-LSTM, and attention are used to generate span representations. Specifically, Bert first maps the input sentence into contextualized word embeddings; these representations are then fed to a Bi-LSTM without fine-tuning Bert; lastly, each span is represented as the attention-weighted sum of the Bi-LSTM hidden states.
In recent years, event-related comprehensive systems have emerged. Their remarkable characteristic is that these systems extract multiple categories of information (e.g., entities, relations, and events) from multiple sources, multiple languages, and heterogeneous data modalities (speech, text, images, and videos).
Li et al. [76] present a comprehensive, open-source multimedia knowledge extraction system (GAIA) and create a
coherent, structured knowledge base. This GAIA system enables the search of complex graph queries and retrieves
multimedia evidence, including text, images, and videos. Specifically, the authors extract coarse-grained events and
arguments using a Bi-LSTM-CRF model and a CNN-based model in the Text Knowledge Extraction (TKE) branch.
Wen et al. [18] also propose a comprehensive extraction system (RESIN) that can automatically construct temporal
event graphs. RESIN extends from sentence-level event extraction to cross-document cross-lingual cross-media event
extraction, coreference resolution, and temporal event tracking.
These event-related comprehensive systems have greatly enhanced the accuracy of information retrieval. The hybrid method integrates the advantages of multiple techniques, multiple sources, multiple languages, and heterogeneous data modalities, making it likely to become a mainstream paradigm in the future, especially in industrial applications.
4 Open-Domain Event Extraction
Clustering-based. Social events are unique aggregations of various semantics, and related events or their evolutions tend to be cohesive. Thus, density-based clustering algorithms can be used to detect new events and discover their evolution. For each event group, an event schema can also be constructed with a slot-value structure through Event Schema Induction (ESI). Peng et al. [11] propose a streaming social event detection and evolution discovery framework. Specifically, an event-based heterogeneous information network (HIN) and a novel Pairwise Popularity Graph Convolutional Network (PP-GCN) are first constructed; then a parallel heterogeneous clustering algorithm (H-DBSCAN) is proposed for streaming event detection and evolution discovery.
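The following sketch illustrates density-based event detection with DBSCAN; TF-IDF vectors stand in for the learned HIN/PP-GCN representations of the cited framework, and the messages and clustering parameters are toy assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import DBSCAN

# Minimal sketch of density-based event detection: social messages are embedded
# (here with TF-IDF as a stand-in for learned representations) and clustered
# with DBSCAN; each resulting cluster is treated as one detected event, while
# noise points (label -1) are left unassigned.
messages = [
    "Earthquake hits the coastal city, several buildings damaged",
    "Strong earthquake damaged buildings in the coastal city",
    "Home team wins the championship final on penalties",
    "Thousands of fans celebrate the championship final win",
]
X = TfidfVectorizer().fit_transform(messages)
labels = DBSCAN(eps=0.7, min_samples=2, metric="cosine").fit_predict(X)
print(labels)   # messages sharing a label are grouped into the same detected event
```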
Parsing-based. Syntactic parsing results are widely used to enhance open-domain event extraction. For example, the verb tag helps detect the event trigger, whereas the noun tag helps filter event arguments, and syntactic dependencies help recover the roles and arguments of the same event when they appear across multiple sentences. Ritter et al. [8] present the first open-domain event extraction and categorization system (TwiCal) for Twitter. As shown in Figure 8, the processing pipeline contains POS tagging, temporal resolution, NER, an event tagger, significance ranking, and event
classification components. Shen et al. [99] present an open-domain event type induction framework (ETYPECLUS). The framework first selects predicates and object heads, then disambiguates predicates, and lastly induces <predicate sense, object head> pairs with embedding and clustering algorithms. Chau et al. [98] use syntactic parsing, WordNet, and a word sense disambiguation tool to extract events from news headlines; the events are then fed to a deep neural network to predict the natural gas price.
Figure 8: Processing pipeline for extracting events from Twitter (left): POS tagging, temporal resolution, NER, event tagging, significance ranking, and event classification. The right side shows examples of calendar entries extracted by TwiCal, e.g., (Steve Jobs, died, 10/6/11, Death) and (iPhone, announcement, 10/4/11, ProductLaunch).
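As a minimal sketch of the parsing-based idea, the snippet below uses spaCy's dependency parse to collect <predicate, object head> pairs, roughly the first step of pipelines such as ETYPECLUS (sense disambiguation and clustering are omitted); it assumes the small English model is installed.

```python
import spacy

# Minimal sketch of parsing-based open-domain extraction: use the dependency
# parse to collect <predicate, object head> pairs; clustering and predicate
# sense disambiguation from the cited pipeline are omitted.
# Assumes the model is installed: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

def predicate_object_pairs(text):
    pairs = []
    for token in nlp(text):
        if token.pos_ == "VERB":                           # candidate predicate
            objects = [child for child in token.children
                       if child.dep_ in ("dobj", "obj", "nsubjpass")]
            for obj in objects:
                pairs.append((token.lemma_, obj.lemma_))   # <predicate, object head>
    return pairs

print(predicate_object_pairs("An American tank fired shells at the Palestine Hotel."))
# e.g. [('fire', 'shell')]
```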
Lexicon-based. Many researchers have contributed lexicons of words or phrases to assist subsequent event extraction tasks. For example, de Vroe et al. [22] present an open-domain, lexicon-based event extraction system, MONTEE, that can distinguish different types of modality: it can tell whether a reported event took place, did not take place, or is uncertain. This is valuable for avoiding the extraction of unreal events. Arnulphy et al. [101] use patterns and shallow parsing to automatically build a lexicon for noun-based event extraction.
Semi-supervised & distant supervision based. Semi-supervised and distant supervision methods are able to generate high-quality training data. Veyseh et al. [19] explore a novel method for open-domain event detection by fine-tuning the pre-trained language model GPT-2 to automatically generate new training data; in particular, a novel teacher-student architecture is adopted to keep the original and generated data consistent. Dor et al. [102] use rules to automatically extract weak labels for event mentions describing economic events. Araki and Mitamura [15] use distant supervision to conduct open-domain event detection; its significant characteristic is that it can detect all kinds of events.
Bayesian-based. Most Bayesian-based open-domain event extraction models assume that a sentence or document is a joint distribution over event types, slots, entities, and contextual features. For example, Wang et al. [21] propose an open event extraction model (AEM) based on Bayesian inference and Generative Adversarial Nets. Specifically, a Dirichlet prior and a generator are used to capture the patterns of latent events, while a discriminator is used to distinguish documents reconstructed from the latent events from the original input documents. Unlike other GAN-based text generation approaches that model the generated text sequence, the generator in AEM learns the projection function between an event distribution and the event-related word distributions, thus capturing event-related patterns. Zhou et al. [9] propose a Bayesian model, the Latent Event Model (LEM), to extract a structured representation of events from social media; its most striking characteristic is that it is fully unsupervised, requiring no annotated data. Reference [37] extracts event types, schemas, and arguments using a neural latent variable network and Bayesian inference (ODEE) and achieves better results than baseline models.
Adversarial Domain Adaptation. The adversarial domain adaptation (ADA) framework was initially proposed by Ganin and Lempitsky and has been widely used in multiple NLP tasks [105]. Naik and Rose [23] leverage the adversarial
domain adaptation (ADA) framework to identify event triggers. This framework treats the event trigger identification
task as a token classification problem. A representation learner is trained to generate token-level representations, which
are predictive for trigger identification but not for domain prediction, making it more domain-invariant. The obvious
advantage is that there is no need to annotate the target domain data.
Open-Domain Event Text Generation. Automated Story Generation (ASG) has been a research problem of interest and a subtask of open-domain event extraction. Fu et al. [96] perform an open-domain event text generation task with an entity chain as its skeleton. For this task, a wiki-augmented generator framework containing an encoder, a retriever, and a decoder is proposed: the encoder encodes the entity chain into hidden representations, the decoder decodes from these hidden representations and generates related stories, and the retriever is responsible for collecting reliable information to enhance the readability of the generated text. Martin et al. [97] model automated story generation as a sampling problem, generating the next event by choosing the one with maximum probability from the event distribution.
We close this section by discussing the advantages and disadvantages of the mentioned works compared with closed-domain event extraction methods. Most open-domain event extraction works focus on detecting new events and extracting related information. This information is beneficial for scenarios that require comprehensive knowledge of broad-coverage, fine-grained, and dynamically evolving event categories, e.g., stock price prediction based on news. However, as the literature review above shows, the existing methods are mainly based on syntactic parsing, clustering, Bayesian inference, lexicons, etc. Their output still falls short of closed-domain event extraction results in two respects. First, since open-domain event extraction needs no predefined schemas, the extracted results are heterogeneous, which increases the difficulty of utilization. Second, since open-domain event extraction has no predefined event types, some research uses the extracted event trigger to represent the event type. Although many researchers have tried to induce event types by clustering or latent event type inference, the results are not always convenient or understandable. Due to the usefulness of dynamically evolving event categories, we believe that more research will explore new paradigms and techniques in open-domain event extraction.
5 Discussion
In this section, we summarize and discuss the current common research issues in event extraction. Despite the considerable
progress in event extraction, several challenges remain, including but not limited to the following aspects.
Datasets. Although various annotated corpora exist and many researchers have explored semi-supervised methods to
automatically label data, the data size and category coverage remain limited compared with the requirements of data-hungry
algorithms. Another problem is category imbalance. For example, existing corpora mainly cover natural disasters, social
relationships, biomedicine, etc.; some categories contain only a small number of instances, and, even worse, some fields
have no annotated corpus at all. Producing more high-quality annotated data calls for further research, for example on
semi-supervised or distant supervision methods.
Document-level and corpus-level event extraction. Most existing event extraction methods extract event
arguments within the sentence scope [7, 61]. However, the extraction results are not ideal in the following two cases.
First, the arguments of the same event often scatter across different sentences. Second, multiple sentences or documents
may characterize the same event. The former case makes the extraction results incomplete, while the latter case makes
them redundant. Document-level and corpus-level event extraction tasks face challenges such as long-term dependency
and entity and event coreference. Researchers have started to address this problem with various mechanisms, such as
end-to-end structured prediction [74], the sequence-to-structure generation paradigm [66], and open-schema event
profiling [106].
Cross-linguistic. Researchers have contributed relatively rich event extraction corpora in English, whereas corpora
in other languages remain scarce. Recently, cross-lingual transfer learning approaches have been applied to event extraction
[107, 67]. For example, Subburathinam et al. [107] use a GCN-based network to transfer an event extraction model from
source-language annotations to the target language. However, GCNs are not good at capturing long-range dependencies
or relations between words that are not directly connected in the dependency tree. Ahmad et al. [67] improve this work by
using attention mechanisms to learn the dependencies between words with different syntactic distances. Cross-linguistic
event extraction can save much of the effort of constructing corpora in other languages and is beneficial for low-resource
languages.
Event coreference. The same event frequently appears in multiple documents; for example, it is common for different
news media to report the same hot news. Even document-level event extraction may not alleviate this redundancy.
Event coreference resolution, or event merging, is therefore crucial for information retrieval, especially in comprehensive
event-related systems that involve multiple sources, multiple languages, and heterogeneous data modalities
(speeches, texts, images, and videos).
Open-domain event extraction needs new schemas and techniques. Current research primarily focuses on
closed-domain event extraction because of its plentiful corpora, mature methods, and acknowledged evaluation mechanisms.
Despite its importance, open-domain event extraction has not received sufficient attention compared with closed-domain
event extraction research. Our review of recent open-domain research shows that its performance has not yet
reached the desired level, and several challenges still hinder its generalization and industrial application. First, there are
few large, high-quality, and widely acknowledged open-domain corpora. Second, mature evaluation mechanisms for
open-domain extraction results have yet to be established. Third, open-domain event extraction needs new schemas and
techniques to enhance performance. We believe this is a promising research direction for the future.
6 Conclusion
In this paper, we review and summarize the literature on event extraction from text. Overall, we focus on providing a
comprehensive overview of event extraction tasks while ignoring the peculiarities of individual approaches. Specifically,
we first introduce the related concepts of event extraction, such as the EE catalog, task definition, corpora, and evaluation
metrics. Then we summarize the literature from a technical point of view. In both the closed-domain and open-domain event
extraction sections, we characterize the literature by year, common framework, technique, corpus, application
field, advantages, and disadvantages. Last, we summarize and discuss the current common issues and related progress in
closed-domain and open-domain event extraction.
Although many challenges remain, event extraction, especially open-domain event extraction, is attracting more
and more attention due to its crucial role in information extraction. This survey provides a way to quickly understand
up-to-date event extraction tasks at a moderate level of detail.
References
[1] George R. Doddington, Alexis Mitchell, Mark A. Przybocki, Lance A. Ramshaw, Stephanie M. Strassel, and
Ralph M. Weischedel. The automatic content extraction (ace) program-tasks, data, and evaluation. In Lrec,
volume 2, pages 837–840, Lisbon, 2004.
[2] Frederik Hogenboom, Flavius Frasincar, Uzay Kaymak, Franciska De Jong, and Emiel Caron. A survey of event
extraction methods from text for decision support systems. Decision Support Systems, 85:12–22, 2016.
[3] Yubo Chen, Liheng Xu, Kang Liu, Daojian Zeng, and Jun Zhao. Event extraction via dynamic multi-pooling
convolutional neural networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational
Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long
Papers), pages 167–176, 2015.
[4] Hristo Tanev, Jakub Piskorski, and Martin Atkinson. Real-time news event extraction for global crisis moni-
toring. In International Conference on Application of Natural Language to Information Systems, pages 207–218. Springer, 2008.
[5] Takeshi Sakaki, Yutaka Matsuo, Tadashi Yanagihara, Naiwala P. Chandrasiri, and Kazunari Nawa. Real-time
event extraction for driving information from social sensors. In 2012 IEEE International Conference on Cyber
Technology in Automation, Control, and Intelligent Systems (CYBER), pages 221–226. IEEE, 2012.
[6] Jiawei Sheng, Shu Guo, Bowen Yu, Qian Li, Yiming Hei, Lihong Wang, Tingwen Liu, and Hongbo Xu. Casee: A
joint learning framework with cascade decoding for overlapping event extraction. In Findings of the Association
for Computational Linguistics: ACL-IJCNLP 2021, pages 164–174, 2021.
[7] Shun Zheng, Wei Cao, Wei Xu, and Jiang Bian. Doc2edag: An end-to-end document-level framework for
chinese financial event extraction. In Proceedings of the 2019 Conference on Empirical Methods in Natural
Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-
IJCNLP), pages 337–346, 2019.
[8] Alan Ritter, Oren Etzioni, and Sam Clark. Open domain event extraction from twitter. In Proceedings of the
18th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1104–1112, 2012.
[9] Deyu Zhou, Liang-Yu Chen, and Yulan He. A simple bayesian modelling approach to event extraction from
twitter. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume
2: Short Papers), pages 700–705, 2014.
[10] Florian Kunneman and Antal Van Den Bosch. Open-domain extraction of future events from twitter. Natural
Language Engineering, 22(5):655–686, 2016.
[11] Hao Peng, Jianxin Li, Yangqiu Song, Renyu Yang, Rajiv Ranjan, Philip S. Yu, and Lifang He. Streaming
social event detection and evolution discovery in heterogeneous information networks. ACM Transactions on
Knowledge Discovery from Data (TKDD), 15(5):1–33, 2021.
[12] Qiang Wei, Zongcheng Ji, Zhiheng Li, Jingcheng Du, Jingqi Wang, Jun Xu, Yang Xiang, Firat Tiryaki, Stephen
Wu, and Yaoyun Zhang. A study of deep learning approaches for medication and adverse drug event extraction
from clinical text. Journal of the American Medical Informatics Association, 27(1):13–21, 2020.
[13] Jing Liu, Songzheng Zhao, and Gang Wang. Ssel-ade: a semi-supervised ensemble learning framework for
extracting adverse drug events from social media. Artificial intelligence in medicine, 84:34–49, 2018.
[14] Yang Peng, Melody Moh, and Teng-Sheng Moh. Efficient adverse drug event extraction using twitter sentiment
analysis. In 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
(ASONAM), pages 1011–1018. IEEE, 2016.
[15] Erwin Filtz, María Navas-Loro, Cristiana Santos, Axel Polleres, and Sabrina Kirrane. Events matter: Extraction
of events from court decisions. In Legal Knowledge and Information Systems, pages 33–42. IOS Press, 2020.
[16] Ekaterina Buyko, Erik Faessler, Joachim Wermter, and Udo Hahn. Event extraction from trimmed dependency
graphs. In Proceedings of the BioNLP 2009 Workshop Companion Volume for Shared Task, pages 19–27, 2009.
[17] Sophia Henn, Abigail Sticha, Timothy Burley, Ernesto Verdeja, and Paul Brenner. Visualization techniques to
enhance automated event extraction. arXiv preprint arXiv:2106.06588, 2021.
[18] Haoyang Wen, Ying Lin, Tuan Lai, Xiaoman Pan, Sha Li, Xudong Lin, Ben Zhou, Manling Li, Haoyu Wang, and
Hongming Zhang. Resin: A dockerized schema-guided cross-document cross-lingual cross-media information
extraction and event tracking system. In Proceedings of the 2021 Conference of the North American Chapter of
the Association for Computational Linguistics: Human Language Technologies: Demonstrations, pages 133–143, 2021.
[19] Amir Pouran Ben Veyseh, Minh Van Nguyen, Bonan Min, and Thien Huu Nguyen. Augmenting open-domain
event detection with synthetic data from gpt-2. In Joint European Conference on Machine Learning and
Knowledge Discovery in Databases, pages 644–660. Springer, 2021.
[20] Frederik Hogenboom, Flavius Frasincar, Uzay Kaymak, and Franciska De Jong. An overview of event extraction
from text. In DeRiVE@ISWC, pages 48–57. Citeseer, 2011.
[21] Rui Wang, Deyu Zhou, and Yulan He. Open event extraction from online text using a generative adversarial
network. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the
9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 282–291, 2019.
[22] Sander Bijl de Vroe, Liane Guillou, Miloš Stanojević, Nick McKenna, and Mark Steedman. Modality and
negation in event extraction. In Proceedings of the 4th Workshop on Challenges and Applications of Automated
Extraction of Socio-political Events from Text (CASE 2021), pages 31–42, 2021.
[23] Aakanksha Naik and Carolyn Rose. Towards open domain event trigger identification using adversarial domain
adaptation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics,
pages 7618–7624, 2020.
[24] Liying Zhan and Xuping Jiang. Survey on event extraction technology in information extraction research area.
In 2019 IEEE 3rd Information Technology, Networking, Electronic and Automation Control Conference (ITNEC),
pages 2121–2126. IEEE, 2019.
[25] Wei Xiang and Bang Wang. A survey of event extraction from text. IEEE Access, 7:173111–173137, 2019.
[26] Qian Li, Hao Peng, Jianxin Li, Yiming Hei, Rui Sun, Jiawei Sheng, Shu Guo, Lihong Wang, and Philip S.
Yu. Deep learning schema-based event extraction: Literature review and current trends. arXiv preprint
arXiv:2107.02126, 2021.
[27] Vera Danilova, Mikhail Alexandrov, and Xavier Blanco. A survey of multilingual event extraction from text. In
International Conference on Applications of Natural Language to Data Bases/Information Systems, pages 85–88. Springer, 2014.
[28] Mohamed Mejri and Jalel Akaichi. A survey of textual event extraction from social networks. In LPKM, 2017.
[29] Jorge A. Vanegas, Sérgio Matos, Fabio González, and José L. Oliveira. An overview of biomolecular event
extraction from scientific documents. Computational and mathematical methods in medicine, 2015, 2015.
[30] Elham Shahab. A short survey of biomedical relation extraction techniques. arXiv preprint arXiv:1707.05850,
2017.
[31] James Pustejovsky, Patrick Hanks, Roser Sauri, Andrew See, Robert Gaizauskas, Andrea Setzer, Dragomir
Radev, Beth Sundheim, David Day, and Lisa Ferro. The timebank corpus. In Corpus linguistics, volume 2003, page 40. Lancaster, UK, 2003.
[32] James Pustejovsky, Jessica Littman, Roser Saurí, and Marc Verhagen. Timebank 1.2 documentation. Event
London, no. April, pages 6–11, 2006.
[33] Roser Saurí and James Pustejovsky. Factbank: a corpus annotated with event factuality. Language resources and
evaluation, 43(3):227–268, 2009.
[34] J-D Kim, Tomoko Ohta, Yuka Tateisi, and Jun Ichi Tsujii. Genia corpus—a semantically annotated corpus for
bio-textmining. Bioinformatics, 19(suppl_1):i180–i182, 2003.
[35] Ron Papka and James Allan. Topic detection and tracking: Event clustering as a basis for first story detection. In
Advances in Information Retrieval, pages 97–126. Springer, 2002.
[36] James Allan. Topic detection and tracking: event-based information organization, volume 12. Springer Science
and Business Media, 2012.
[37] Xiao Liu, He-Yan Huang, and Yue Zhang. Open domain event extraction using neural latent variable models. In
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2860–2871, 2019.
[38] Kiem-Hieu Nguyen, Xavier Tannier, Olivier Ferret, and Romaric Besançon. A dataset for open event extraction in
english. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16),
pages 1939–1943, 2016.
[39] Aida Mostafazadeh Davani, Leigh Yeh, Mohammad Atari, Brendan Kennedy, Gwenyth Portillo-Wightman,
Elaine Gonzalez, Natalie Delong, Rhea Bhatia, Arineh Mirinjian, and Xiang Ren. Reporting the unreported:
Event extraction for analyzing the local representation of hate crimes. arXiv preprint arXiv:1909.02126, 2019.
[40] Om P. Damani. Improving pointwise mutual information (pmi) by incorporating significant co-occurrence. In
Proceedings of the Seventeenth Conference on Computational Natural Language Learning, pages 20–28, 2013.
[41] Pavel Pecina. An extensive empirical study of collocation extraction methods. In Proceedings of the ACL Student
Research Workshop, pages 13–18, 2005.
[42] Ellen Riloff. Automatically constructing a dictionary for information extraction tasks. In AAAI, volume 1, page 2.1. Citeseer, 1993.
[43] Akane Yakushiji, Yuka Tateisi, Yusuke Miyao, and Jun-ichi Tsujii. Event extraction from biomedical papers
using a full parser. In Biocomputing 2001, pages 408–419. World Scientific, 2000.
[44] Halil Kilicoglu and Sabine Bergler. Syntactic dependency based heuristics for biological event extraction. In
Proceedings of the BioNLP 2009 Workshop Companion Volume for Shared Task, pages 119–127, 2009.
[45] Roman Yangarber, Ralph Grishman, Pasi Tapanainen, and Silja Huttunen. Automatic acquisition of domain
knowledge for information extraction. In COLING 2000 Volume 2: The 18th International Conference on
Computational Linguistics, 2000.
[46] Chang-Shing Lee, Yea-Juan Chen, and Zhi-Wei Jian. Ontology-based fuzzy event extraction agent for chinese
e-news summarization. Expert Systems with Applications, 25(3):431–447, 2003.
[47] Jethro Borsje, Frederik Hogenboom, and Flavius Frasincar. Semi-automatic financial events discovery based on
lexico-semantic patterns. International Journal of Web Engineering and Technology, 6(2):115–140, 2010.
[48] Wei Lu and Dan Roth. Automatic event extraction with structured preference modeling. In Proceedings of the
50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 835–844, 2012.
[49] Chen Chen and Vincent Ng. Joint modeling for chinese event extraction with rich linguistic features. In
Proceedings of COLING 2012, pages 529–544, 2012.
[50] Peifeng Li, Qiaoming Zhu, Hongjun Diao, and Guodong Zhou. Joint modeling of trigger identification and event
type determination in chinese event extraction. In Proceedings of COLING 2012, pages 1635–1652, 2012.
[51] Ruihong Huang and Ellen Riloff. Modeling textual cohesion for event extraction. In Proceedings of the AAAI
Conference on Artificial Intelligence, volume 26, 2012.
[52] Jari Björne and Tapio Salakoski. Generalizing biomedical event extraction. In Proceedings of BioNLP Shared
Task 2011 Workshop, pages 183–191, 2011.
[53] Yu Hong, Jianfeng Zhang, Bin Ma, Jianmin Yao, Guodong Zhou, and Qiaoming Zhu. Using cross-entity inference
to improve event extraction. In Proceedings of the 49th annual meeting of the association for computational
linguistics: human language technologies, pages 1127–1136, 2011.
[54] Jari Björne, Filip Ginter, Sampo Pyysalo, Jun’ichi Tsujii, and Tapio Salakoski. Complex event extraction at
pubmed scale. Bioinformatics, 26(12):i382–i390, 2010.
[55] Sophia Ananiadou, Sampo Pyysalo, Jun Ichi Tsujii, and Douglas B. Kell. Event extraction for systems biology
by text mining the literature. Trends in biotechnology, 28(7):381–390, 2010.
[56] Makoto Miwa, Rune Sætre, Jin-Dong Kim, and Jun’ichi Tsujii. Event extraction with complex event classification
using rich features. Journal of bioinformatics and computational biology, 8(01):131–146, 2010.
[57] Rune Sætre, Kazuhiro Yoshida, Makoto Miwa, Takuya Matsuzaki, Yoshinobu Kano, and Jun’ichi Tsujii.
Extracting protein interactions from text with the unified akanere event extraction system. IEEE/ACM transactions
on computational biology and bioinformatics, 7(3):442–453, 2010.
[58] Zuofeng Li, Feifan Liu, Lamont Antieau, Yonggang Cao, and Hong Yu. Lancet: a high precision medication event
extraction system for clinical text. Journal of the American Medical Informatics Association, 17(5):563–567,
2010.
[59] Shasha Liao and Ralph Grishman. Using document level cross-event inference to improve event extraction. In
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 789–797, 2010.
[60] Siddharth Patwardhan and Ellen Riloff. A unified model of phrasal and sentential evidence for information
extraction. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing,
pages 151–160, 2009.
[61] Heng Ji and Ralph Grishman. Refining event extraction through cross-document inference. In Proceedings of
ACL-08: HLT, pages 254–262, 2008.
[62] Martina Naughton, Nicholas Kushmerick, and Joseph Carthy. Event extraction from heterogeneous news sources.
In proceedings of the AAAI workshop event extraction and synthesis, pages 1–6, 2006.
[63] David Ahn. The stages of event extraction. In Proceedings of the Workshop on Annotating and Reasoning about
Time and Events, pages 1–8, 2006.
[64] Yanyan Zhao, Bing Qin, Wan-xiang Che, and Ting Liu. Research on chinese event extraction. Journal of Chinese
Information Processing, 22(1):3–8, 2008.
[65] Zheng Chen and Heng Ji. Language specific issue and feature exploration in chinese event extraction. In
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter
of the Association for Computational Linguistics, Companion Volume: Short Papers, pages 209–212, 2009.
[66] Yaojie Lu, Hongyu Lin, Jin Xu, Xianpei Han, Jialong Tang, Annan Li, Le Sun, Meng Liao, and Shaoyi Chen.
Text2event: Controllable sequence-to-structure generation for end-to-end event extraction. arXiv preprint
arXiv:2106.09232, 2021.
[67] Wasi Uddin Ahmad, Nanyun Peng, and Kai-Wei Chang. Gate: Graph attention transformer encoder for cross-
lingual relation and event extraction. In The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21),
2021.
[68] Weizhong Zhao, Jinyong Zhang, Jincai Yang, Tingting He, Huifang Ma, and Zhixin Li. A novel joint biomedical
event extraction framework via two-level modeling of documents. Information Sciences, 550:27–40, 2021.
[69] Kevin Lybarger, Mari Ostendorf, Matthew Thompson, and Meliha Yetisgen. Extracting covid-19 diagnoses
and symptoms from clinical text: A new annotated corpus and neural event extraction framework. Journal of
Biomedical Informatics, 117:103761, 2021.
[70] Kevin Lybarger, Mari Ostendorf, and Meliha Yetisgen. Annotating social determinants of health using active
learning, and characterizing determinants using neural event extraction. Journal of Biomedical Informatics,
113:103631, 2021.
[71] Bonan Min, Benjamin Rozonoyer, Haoling Qiu, Alexander Zamanian, and Jessica MacBride. Excavatorcovid:
Extracting events and relations from text corpora for temporal and causal analysis for covid-19. arXiv preprint
arXiv:2105.01819, 2021.
[72] Tommaso Caselli, Osman Mutlu, Angelo Basile, and Ali Hürriyetoğlu. Protest-er: Retraining bert for protest
event extraction. In Proceedings of the 4th Workshop on Challenges and Applications of Automated Extraction of
Socio-political Events from Text (CASE 2021), pages 12–19, 2021.
[73] Qianren Mao, Xi Li, Hao Peng, Jianxin Li, Dongxiao He, Shu Guo, Min He, and Lihong Wang. Event prediction
based on evolutionary event ontology knowledge. Future Generation Computer Systems, 115:76–89, 2021.
[74] Kung-Hsiang Huang and Nanyun Peng. Efficient end-to-end learning of cross-event dependencies for document-
level event extraction. arXiv preprint arXiv:2010.12787, 2020.
[75] Jian Liu, Yubo Chen, Kang Liu, Wei Bi, and Xiaojiang Liu. Event extraction as machine reading comprehension.
In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP),
pages 1641–1651, 2020.
[76] Manling Li, Alireza Zareian, Ying Lin, Xiaoman Pan, Spencer Whitehead, Brian Chen, Bo Wu, Heng Ji, Shih-Fu
Chang, and Clare Voss. Gaia: A fine-grained multimedia knowledge extraction system. In Proceedings of the
58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 77–86, 2020.
[77] Sen Yang, Dawei Feng, Linbo Qiao, Zhigang Kan, and Dongsheng Li. Exploring pre-trained language models for
event extraction and generation. In Proceedings of the 57th Annual Meeting of the Association for Computational
Linguistics, pages 5284–5294, 2019.
[78] Xiao Liu, Zhunchen Luo, and He-Yan Huang. Jointly multiple events extraction via attention-based graph
information aggregation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language
Processing, pages 1247–1256, 2018.
[79] Jari Björne and Tapio Salakoski. Biomedical event extraction using convolutional neural networks and dependency
parsing. In Proceedings of the BioNLP 2018 workshop, pages 98–108, 2018.
[80] Thien Huu Nguyen, Kyunghyun Cho, and Ralph Grishman. Joint event extraction via recurrent neural networks.
In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational
Linguistics: Human Language Technologies, pages 300–309, 2016.
[81] Ruihong Huang and Ellen Riloff. Bootstrapped training of event extraction classifiers. In Proceedings of the 13th
Conference of the European Chapter of the Association for Computational Linguistics, pages 286–295, 2012.
[82] Deyu Zhou and Dayou Zhong. A semi-supervised learning framework for biomedical event extraction based on
hidden topics. Artificial intelligence in medicine, 64(1):51–58, 2015.
[83] James Ferguson, Colin Lockard, Daniel S. Weld, and Hannaneh Hajishirzi. Semi-supervised event extraction with
paraphrase clusters. In Proceedings of the 2018 Conference of the North American Chapter of the Association
for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pages 359–364, 2018.
[84] Shashank Gupta, Sachin Pawar, Nitin Ramrakhiyani, Girish Keshav Palshikar, and Vasudeva Varma. Semi-
supervised recurrent neural network for adverse drug reaction mention extraction. BMC bioinformatics, 19(8):1–7,
2018.
[85] Lifu Huang and Heng Ji. Semi-supervised new event type induction and event detection. In Proceedings of the
2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 718–724, 2020.
[86] Reza Mansouri, Mahmood Naderan-Tahan, and Mohammad Javad Rashti. A semi-supervised learning method
for fake news detection in social media. In 2020 28th Iranian Conference on Electrical Engineering (ICEE),
pages 1–5. IEEE, 2020.
[87] Koichi Miyazaki, Tatsuya Komatsu, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, and Kazuya Takeda.
Conformer-based sound event detection with semi-supervised learning and data augmentation. dim, 1:4, 2020.
[88] Yang Zhou, Yubo Chen, Jun Zhao, Yin Wu, Jiexin Xu, and Jinlong Li. What the role is vs. what plays the role:
Semi-supervised event argument extraction via dual question answering. In Proceedings of the AAAI Conference
on Artificial Intelligence, volume 35, pages 14638–14646, 2021.
[89] Qi Chen, Wei Wang, Kaizhu Huang, Suparna De, and Frans Coenen. Multi-modal generative adversarial
networks for traffic event detection in smart cities. Expert Systems with Applications, 177:114939, 2021.
[90] Kevin Reschke, Martin Jankowiak, Mihai Surdeanu, Christopher D. Manning, and Dan Jurafsky. Event extraction
using distant supervision. In Proceedings of the Ninth International Conference on Language Resources
and Evaluation (LREC’14), pages 4527–4531, 2014.
[91] Hang Yang, Yubo Chen, Kang Liu, Yang Xiao, and Jun Zhao. Dcfee: A document-level chinese financial
event extraction system based on automatically labeled training data. In Proceedings of ACL 2018, System
Demonstrations, pages 50–55, 2018.
[92] Xinyu Zuo, Yubo Chen, Kang Liu, and Jun Zhao. Knowdis: Knowledge enhanced data augmentation for event
causality detection via distant supervision. In Proceedings of the 28th International Conference on Computational
Linguistics, pages 1544–1550, 2020.
[93] Reem Alrashdi and Simon O’Keefe. Automatic labeling of tweets for crisis response using distant supervision.
In Companion Proceedings of the Web Conference 2020, pages 418–425, 2020.
[94] Nada Boudjellal, Huaping Zhang, Asif Khan, and Arshad Ahmad. Biomedical relation extraction using distant
supervision. Scientific Programming, 2020, 2020.
[95] Jun Araki and Teruko Mitamura. Open-domain event detection using distant supervision. In Proceedings of the
27th International Conference on Computational Linguistics, pages 878–891, 2018.
[96] Zihao Fu, Lidong Bing, and Wai Lam. Open domain event text generation. In Proceedings of the AAAI
Conference on Artificial Intelligence, volume 34, pages 7748–7755, 2020.
[97] Lara Martin, Prithviraj Ammanabrolu, Xinyu Wang, William Hancock, Shruti Singh, Brent Harrison, and Mark
Riedl. Event representations for automated story generation with deep neural nets. In Proceedings of the AAAI
Conference on Artificial Intelligence, volume 32, 2018.
[98] Minh Triet Chau, Diego Esteves, and Jens Lehmann. A neural-based model to predict the future natural gas
market price through open-domain event extraction. In CLEOPATRA@ESWC, pages 17–31, 2020.
[99] Jiaming Shen, Yunyi Zhang, Heng Ji, and Jiawei Han. Corpus-based open-domain event type induction. arXiv
preprint arXiv:2109.03322, 2021.
[100] Deyu Zhou, Xuan Zhang, and Yulan He. Event extraction from twitter using non-parametric bayesian mixture
model with word embeddings. In Proceedings of the 15th Conference of the European Chapter of the Association
for Computational Linguistics: Volume 1, Long Papers, pages 808–817, 2017.
[101] Béatrice Arnulphy, Xavier Tannier, and Anne Vilnat. Automatically generated noun lexicons for event extrac-
tion. In International Conference on Intelligent Text Processing and Computational Linguistics, pages 219–231. Springer, 2012.
[102] Liat Ein Dor, Ariel Gera, Orith Toledo-Ronen, Alon Halfon, Benjamin Sznajder, Lena Dankin, Yonatan Bilu,
Yoav Katz, and Noam Slonim. Financial event extraction using wikipedia-based weak supervision. In Proceedings
of the Second Workshop on Economics and Natural Language Processing, pages 10–15, 2019.
[103] David Bamman, Brendan O'Connor, and Noah A. Smith. Learning latent personas of film characters. In
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long
Papers), pages 352–361, 2013.
[104] Sasa Petrovic, Miles Osborne, Richard McCreadie, Craig Macdonald, Iadh Ounis, and Luke Shrimpton. Can
twitter replace newswire for breaking news? In Seventh international AAAI conference on weblogs and social
media, 2013.
[105] Yaroslav Ganin and Victor Lempitsky. Unsupervised domain adaptation by backpropagation. In International
conference on machine learning, pages 1180–1189. PMLR, 2015.
[106] Quan Yuan, Xiang Ren, Wenqi He, Chao Zhang, Xinhe Geng, Lifu Huang, Heng Ji, Chin-Yew Lin, and Jiawei
Han. Open-schema event profiling for massive news corpora. In Proceedings of the 27th ACM International
Conference on Information and Knowledge Management, pages 587–596, 2018.
[107] Ananya Subburathinam, Di Lu, Heng Ji, Jonathan May, Shih-Fu Chang, Avirup Sil, and Clare Voss. Cross-lingual
structure transfer for relation and event extraction. In Proceedings of the 2019 Conference on Empirical Methods
in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing
(EMNLP-IJCNLP), pages 313–325, 2019.