key: cord-0214701-hx792uwf authors: Han, Songqiao; Huang, Hailiang; Liu, Jiangwei; Xiao, Shengsheng title: American Hate Crime Trends Prediction with Event Extraction date: 2021-11-09 journal: nan DOI: nan sha: 26cac4ddb79b6236b3793c0350cfd05e49106e09 doc_id: 214701 cord_uid: hx792uwf Social media platforms may provide potential space for discourses that contain hate speech and, even worse, can act as a propagation mechanism for hate crimes. The FBI's Uniform Crime Reporting (UCR) Program collects hate crime data and releases a statistical report yearly. These statistics provide information for determining national hate crime trends. They can also provide valuable holistic and strategic insight for law enforcement agencies and help lawmakers justify specific legislation. However, the reports are mostly released the following year and lag behind many immediate needs. Recent research mainly focuses on hate speech detection in social media text or empirical studies on the impact of a confirmed crime. This paper proposes a framework that first utilizes text mining techniques to extract hate crime events from New York Times news, then uses the results to facilitate predicting American national-level and state-level hate crime trends. Experimental results show that our method can significantly enhance prediction performance compared with time series or regression methods without event-related factors. Our framework broadens the methods of national-level and state-level hate crime trend prediction. People increasingly communicate opinions and share information through social network platforms such as online forums, blogs, Facebook, etc. Although many of the interactions among consumer-generated content (CGC) are positive conversations, the proportion of content containing abusive and hateful language is increasing significantly. Concentrated outbreaks of hate language around certain events can lead to hate crimes.
For example, following Trump's inflammatory rhetoric throughout the political campaign, related hate crimes such as anti-Muslim or anti-immigrant crimes rose significantly compared with several years before 1 . The FBI's UCR Program describes hate crime as "a committed criminal offense that is motivated, in whole or in part, by the offender's bias(es) against a race, religion, disability, sexual orientation, ethnicity, gender, or gender identity" 2 . We review the literature on hate crime research and find two prevailing genres: the econometric genre and the computer science genre. (1) The econometric genre focuses on empirical studies of the correlation between hate crimes and the factors affecting them. For example, [1] studied the relationship between depression, impulsivity, or post-traumatic stress and sexual orientation minority adults' victimization-suicide risk. The results set the backdrop for sexual orientation minority health risk and suicide prevention policy research. However, this kind of work mainly focuses on analyses of a single category of hate crime. There is limited research on national or local hate crime trend prediction. (2) The computer science genre focuses on hate speech detection, which assists in predicting and preventing cybercrimes. Keyword-based approaches and deep-learning-based approaches are the two mainstream methods for hate speech detection. For example, [2] created a lexicon related to hate speech and used semantic features to detect the presence of hate speech in web discourses. [3] introduced a novel neural network based on the pre-trained language model BERT (Bidirectional Encoder Representations from Transformers) to identify hateful content. However, there is limited research utilizing computer science technologies to predict national or local hate crime trends. Unlike the above research, we first utilize a deep learning method to extract hate crime events from New York Times news reports.
Then we construct three event-related factors and add them to our econometric regression model to predict the FBI hate crime trends. Our study is inspired by previous work [4] that used event extraction technologies to report local hate crimes not covered by the FBI's hate crime reports. We adopt this event extraction approach, train the event extraction model on the Patch corpus, and then apply the trained model to identify incidents of hate crime reported in the New York Times corpus. We build a time series model and a series of regression models as our base models. After adding event-related factors to our regression model, we estimate the parameters by Maximum Likelihood Estimation (MLE) and predict the FBI hate crime trends. The experimental results demonstrate that event-related factors can significantly enhance model performance. Our framework and findings can provide valuable holistic and strategic insight for law enforcement agencies and help justify specific legislation in advance. The remainder of this paper is organized as follows. We review related work in Section 2 and introduce the data and processing in Section 3. We propose our framework and introduce its modules in Section 4. Section 5 reports the experimental results and discusses shortcomings and future work, followed by the conclusion in Section 6. To prohibit hate speech, most social media platforms have established many rules to detect whether a post or reply contains hate speech [5] . Given the massive volume of posts and replies, only automatic approaches are suited to this task. We review the research on automatic hate speech detection and classify it into two main categories: keyword-based and machine learning approaches. Keyword-based approaches typically maintain a dictionary or database of hate speech ontologies and their related expressions, then utilize this dictionary to identify whether a post contains hateful keywords [2] .
They are the most straightforward and efficient unsupervised methods for this problem. However, they still have many limitations. For example, they cannot identify nuanced expressions (e.g., figurative expressions) or content containing hate speech that does not use the specific keywords in the maintained dictionary (e.g., abbreviations). Machine learning approaches are currently the most popular methods in automatic hate speech detection. They usually treat hate speech detection as a text classification task. Once the corpus is annotated by labeling whether each piece of content contains hate speech or not, standard machine learning methods for text classification can be adapted to hate speech detection. For example, [3] introduced a novel transfer learning approach based on the existing pre-trained language model BERT to detect hateful speech in social media content automatically. Event Extraction (EE) is an advanced form of Information Extraction (IE) that handles textual content and relations between entities, often performed after executing a series of initial NLP steps [6] . Event extraction from news articles is a commonly required prerequisite for various tasks, such as article summarization, article clustering, or news aggregation [7] . From the view of technical implementation, closed-domain event extraction approaches include pattern matching methods and machine-learning-based methods (including deep-learning methods). Pattern matching methods usually rely on predefined patterns or schemas. Their advantage is requiring less training data; their disadvantage is the cost of defining and maintaining the patterns. For example, [7] improved the universal EE model (Giveme5W1H) to detect main events in an article, using syntactic and domain-specific rules to extract the relevant phrases that automatically answer the 5W1H questions: who did what, when, where, why, and how.
[8] proposed a hybrid collaborative filtering method for event recommendation in event-based social networks. The advantage of machine learning and deep learning methods is that they can automatically and effectively extract significant features from text. For instance, [4] trained a deep learning model to perform event extraction. They detected hate crimes in hyper-local news articles, analyzed the hate crime trend, and compared the results with the FBI reports. The results provided a lower-bound estimate of the occurrence frequency of hate crimes, especially in cities not covered by FBI reports. In our study, we adopt this approach to extract hate crime events. Many empirical studies focus on the effects of certain hate crimes or the relation between a hate crime and a specific event [9] . For example, [10] studied the psychological sequelae of hate-crime victimization among lesbian, gay, and bisexual adults. The results showed that, compared with other recent crime victims, lesbian and gay hate-crime survivors manifested significantly more depression, anger, anxiety, and post-traumatic stress symptoms. The findings highlight the importance of recognizing hate-crime survivors' special needs in clinical settings and public policies. Another recent example concerns how the Covid-19 pandemic sparked racial animus in the United States. [11] collected Google searches and Twitter posts that contained anti-Asian racial hate speech to estimate the effect of the Covid-19 pandemic on racial animus. The results suggest that racial animus can be effectively curbed by deemphasizing the connection between the disease and a particular racial group. We use the Patch Hate Crime Dataset 3 [4] to train the event extraction model. The dataset was originally introduced by Mostafazadeh Davani and colleagues. They collected hyper-local news from the Patch 4 website and annotated a subset of this corpus.
The annotated labels covered two aspects: whether the news reported a hate crime event and, if so, which hate crime attributes it involved. The corpus contains 5171 samples, with 1979 positive samples and 3192 negative samples. The New York Times provides an API 5 for retrieving its historical news reports month by month, returning titles, abstracts, keywords, URLs, publication dates, etc. However, the results do not contain the news content. We utilize web crawler technology to collect the news contents through the corresponding URLs from January 2007 to December 2020. Finally, 165,913 news reports remain after removing irrelevant categories and news with too little content. We grouped the news reports by quarter, counted the total number of reports in each group, and named this time series news_num, with an average of 2962 reports per quarter. The previously trained hate crime model is applied to this corpus to detect whether a piece of news reports a hate crime event. We count the predicted results and name this time series event_detected_num. We collect hate crime data reported to the FBI in the United States from 2007 to 2019. The statistics 6 report hate crime incidents per bias motivation and quarter by state and agency, together with the population covered by each state or agency. It may be helpful to look not just at annual hate crime numbers but also at quarterly statistics. We plot the quarterly trends in Figure 1 by breaking down the FBI hate crime data by quarter. The cyclical patterns emerging in Figure 1 suggest that the quarterly hate crime time series may have a seasonal component. The Augmented Dickey-Fuller Test (ADF test) shows it has a unit root, meaning it is a nonstationary time series. By decomposing this time series into trend, seasonal, and irregular components using the moving average method, we obtain its corresponding components and plot them in Figure 2 .
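The classical moving-average decomposition used here can be sketched in a few lines of plain Python. This is an illustrative reimplementation of the standard additive scheme for quarterly data (the same scheme as R's stats::decompose), run on toy numbers rather than the actual FBI series:

```python
# Sketch: additive decomposition of a quarterly series into trend,
# seasonal, and irregular components via centered moving averages.
# Illustrative toy data only, not the FBI hate crime series.

def decompose_quarterly(series):
    n, period = len(series), 4
    # Centered moving average of even order 4: average of two adjacent
    # 4-term means, so the window is symmetric around t.
    trend = [None] * n
    for t in range(2, n - 2):
        ma1 = sum(series[t - 2:t + 2]) / period
        ma2 = sum(series[t - 1:t + 3]) / period
        trend[t] = (ma1 + ma2) / 2
    # Average the detrended values per quarter to get seasonal effects.
    detrended = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            detrended[t % period].append(series[t] - trend[t])
    means = [sum(v) / len(v) for v in detrended]
    grand = sum(means) / period          # center so effects sum to zero
    seasonal = [m - grand for m in means]
    irregular = [series[t] - trend[t] - seasonal[t % period]
                 if trend[t] is not None else None
                 for t in range(n)]
    return trend, seasonal, irregular

# Toy quarterly series with an upward trend and a Q1 bump.
y = [120, 100, 105, 110, 130, 108, 112, 118, 141, 115, 120, 126]
trend, seasonal, irregular = decompose_quarterly(y)
print([round(s, 2) for s in seasonal])
```

Subtracting the seasonal component from the raw series gives the seasonally adjusted series that the subsequent models are built on.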
Ljung-Box test results (p-value = 0.005) on the irregular component show it is white noise and thus demonstrate the adequacy of the decomposition. Next, we build models on the deseasonalized data. We estimate the parameters using the data between quarter 1 of 2007 and quarter 4 of 2018 and forecast the hate crimes between quarter 1 and quarter 4 of 2019. In this section, we introduce our hate crime prediction framework. As illustrated in Figure 3 , there are two kinds of models. The left part is the Time Series model, and the right part is our approach, which contains two main components: the Event Extraction Module and the Regression Module. The Time Series Module uses time series theory to predict the FBI hate crime trends and serves as one of our base models. The Event Extraction Module extracts event types and attributes, the results of which are used to construct event-related factors. The Regression Module integrates event-related factors and other predictive factors through regression methods to predict the FBI hate crime trends. Generally, a non-stationary time series can be transformed into a stationary one by a differencing operation, and ARIMA is a typical method for modeling a stationary time series. A nonseasonal ARIMA(p, d, q) model has three integer parameters p, d, q: the orders of the AR, differencing, and MA parts, respectively. For example, a time series y_t is said to be an ARIMA(p, 1, q) process if the transformation c_t = Δy_t = y_t − y_{t−1} = (1 − L)y_t follows a stationary and invertible ARMA(p, q) model, where L is the back-shift operator (also named the lag operator). A general ARMA(p, q) model is formulated as

c_t = c + α_1 c_{t−1} + … + α_p c_{t−p} + ε_t + θ_1 ε_{t−1} + … + θ_q ε_{t−q}

where ε_t is a white noise process with variance σ², c is the intercept term, and α_i and θ_i are the coefficients of the AR and MA processes. Considering that some financial or economic time series exhibit particular cyclical or periodic behavior, we first check whether a time series needs seasonal adjustment to remove the seasonal effect.
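The differencing transform and an AR fit can be illustrated as follows. This is a minimal sketch on a noise-free toy series, not the paper's actual MLE estimation; fit_ar1 is a hypothetical helper that recovers an AR(1) model in the differences by ordinary least squares:

```python
# Sketch: first differencing c_t = (1 - L) y_t, then fitting an AR(1)
# model c_t = c + a * c_{t-1} + e_t by ordinary least squares. The toy
# series is noise-free so the fit recovers the true parameters; the
# paper estimates its ARIMA model with standard MLE routines instead.

def difference(y):
    return [y[t] - y[t - 1] for t in range(1, len(y))]

def fit_ar1(c):
    # Closed-form simple linear regression of c_t on c_{t-1}.
    x, z = c[:-1], c[1:]
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    slope = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z)) / \
            sum((xi - mx) ** 2 for xi in x)
    return mz - slope * mx, slope   # (intercept, AR coefficient)

# Build y so that its first differences follow c_t = 0.5 + 0.6 * c_{t-1}.
y, c_prev = [0.0], 1.0
for _ in range(10):
    c_prev = 0.5 + 0.6 * c_prev
    y.append(y[-1] + c_prev)

intercept, coef = fit_ar1(difference(y))
print(round(intercept, 2), round(coef, 2))  # 0.5 0.6
```

In practice the AR and MA orders are chosen from ACF/PACF diagnostics and the parameters estimated by maximum likelihood, as described later in the paper.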
In our study, we utilize the stats package in R to decompose the quarterly FBI hate crime time series (fbi_num) into seasonal, trend, and irregular components. Finally, we transform the trend component (fbi_num_noseasonnal) into a stationary time series and build an ARIMA model. The Event Extraction Module is responsible for extracting event types and attributes, the results of which are used to construct event-related predictive factors. We follow previous work [4] and adapt the Multi-Instance Learning (MIL) method developed by [12] to extract hate crime events. As shown in Figure 4 , the event extraction module has three essential components. The first component generates sentence embeddings, which are regarded as the local features of an article; its BiLSTM layer learns sequence information, considering each input word's past and future context. The second component consists of a CNN layer and a pooling layer, which scan the sentence embeddings and generate a lower-dimensional vector embedding representing the global feature of an article. While the first two components generate local and global features, the third component extracts the hate crime events themselves, including event types and attributes. Both event type and event attribute extraction are formulated as classification tasks; the only difference is that event type detection is a binary classification while event attribute extraction is a multi-label classification task. We use the Patch Hate Crime dataset to train these two separate models and save their parameters separately. Figure 4 : Architecture of the Event Extraction Module. Previous research suggests hate crime may be explained by criminological strain theory, other criminological theories, or social-psychological theories.
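The multi-instance intuition behind the module can be sketched with a toy example: an article is a bag of sentences, and one strongly indicative sentence suffices to flag the whole article. The per-sentence scores below are hypothetical stand-ins for the BiLSTM/CNN model's outputs, and max pooling is used only to illustrate the idea, not to reproduce the actual architecture:

```python
# Toy illustration of the multi-instance idea behind the event
# extraction module: an article (a bag of sentences) is labeled
# positive if its most hate-crime-indicative sentence crosses a
# threshold. The real model uses BiLSTM sentence encodings and a CNN
# article encoder; the scores and threshold here are hypothetical.

def article_label(sentence_scores, threshold=0.5):
    # Max pooling over sentence-level probabilities: one strong
    # sentence is enough to flag the whole article.
    return max(sentence_scores) >= threshold

articles = {
    "report_a": [0.05, 0.92, 0.10],  # one sentence strongly indicative
    "report_b": [0.12, 0.08, 0.20],  # no sentence is
}
detected = {name: article_label(scores) for name, scores in articles.items()}
print(detected)  # {'report_a': True, 'report_b': False}
```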
We consider a broader set of candidate explanatory variables than previous literature mentioned. Because some candidate variables are unmeasurable or severely collinear with other factors, we finally retain 14 variables after data exploration (e.g., collinearity tests). Table 1 summarizes the candidate explanatory factors related to hate crimes. Criminological strain theory [22] emphasizes that the mismatch between the goal of material possessions and the means available to achieve it may contribute to hate crimes or violent crimes. From this view, we utilize the strain-theory and criminological predictors retained in Table 1 . The final accepted variables include the unemployment rate, the murder and nonnegligent manslaughter rate, the rape rate, the robbery rate, the aggravated assault rate, the burglary rate, Black homicide victims, arrests for carrying or possessing weapons, arrests for drug abuse violations, and total law enforcement employees. We formulate the deseasonalized FBI hate crime prediction task as:

fbi_num_noseasonnal = α_0 + α_1 * aggravated_assault_rate(−1) + α_2 * arrests_drug_abuse_violations(−1) + α_3 * arrests_weapons(−1) + α_4 * burglary_rate(−1) + α_5 * homicide_victims_black(−1) + α_6 * murder_nonnegligent_manslaughter_rate(−1) + α_7 * population + α_8 * rape_rate(−1) + α_9 * robbery_rate(−1) + α_10 * total_law_enforcement_employees(−1) + α_11 * uner_quar(−1) + ε_2   (2)

where ε_2 is the residual error term that represents all variation left unexplained. Social-psychological theory [13] argues that media coverage, especially sensationalist coverage of spectacular hate crime events, contributes to hate crime contagion. We utilize event extraction technology to extract hate crime events from the New York Times to construct social-psychological variables. Once the trained event extraction model is applied to the New York Times news data, we obtain the detected event type list.
Then we count the number of news reports and detected events, leading to two time series: the New York Times article count (news_num) and the hate crime prediction count (event_detected_num). Note that the extracted hate crime events are not the same as those mentioned in the FBI's reports. We add these two event-related variables to formula 2, updating it as below:

fbi_num_noseasonnal = α_0 + α_1 * aggravated_assault_rate(−1) + α_2 * arrests_drug_abuse_violations(−1) + α_3 * arrests_weapons(−1) + α_4 * burglary_rate(−1) + α_5 * homicide_victims_black(−1) + α_6 * murder_nonnegligent_manslaughter_rate(−1) + α_7 * population + α_8 * rape_rate(−1) + α_9 * robbery_rate(−1) + α_10 * total_law_enforcement_employees(−1) + α_11 * uner_quar(−1) + α_12 * event_detected_num + α_13 * news_num + ε_3   (3)

However, the absolute amounts of news_num and event_detected_num may not be sufficient to depict the extent to which hate crime reports spread on social media at the time. We construct a relative indicator to characterize the current hate crime transmission level, named hate_reported_index, which is calculated by:

hate_reported_index = event_detected_num / news_num   (4)

Finally, we integrate the social-psychological variables and the criminological strain variables into one model. Formally, we fulfill the hate crime prediction task with the following equation:

fbi_num_noseasonnal = α_0 + α_1 * aggravated_assault_rate(−1) + α_2 * arrests_drug_abuse_violations(−1) + α_3 * arrests_weapons(−1) + α_4 * burglary_rate(−1) + α_5 * homicide_victims_black(−1) + α_6 * murder_nonnegligent_manslaughter_rate(−1) + α_7 * population + α_8 * rape_rate(−1) + α_9 * robbery_rate(−1) + α_10 * total_law_enforcement_employees(−1) + α_11 * uner_quar(−1) + α_12 * event_detected_num + α_13 * news_num + α_14 * hate_reported_index + ε_3   (5)

This section reports the experimental results, including event extraction results, national hate crime prediction results (the time series model and regression models), and state-level hate crime prediction results (panel data regression models).
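A minimal sketch of how the quarterly event-related factors could be assembled from the detector's per-article output follows. Treating hate_reported_index as the share of articles flagged as hate crime reports is our reading of the "relative indicator"; both that formula and the data below are illustrative assumptions:

```python
# Sketch: building quarterly event-related factors from per-article
# detector output. Computing hate_reported_index as the share of
# articles flagged as hate crime reports is an assumption for the
# paper's relative indicator; the flags are hypothetical.

def quarterly_factors(quarters, flags_per_quarter):
    factors = []
    for q in quarters:
        flags = flags_per_quarter[q]          # one bool per article
        news_num = len(flags)
        event_detected_num = sum(flags)
        hate_reported_index = (event_detected_num / news_num
                               if news_num else 0.0)
        factors.append((q, news_num, event_detected_num,
                        round(hate_reported_index, 3)))
    return factors

flags = {
    "2019Q1": [True, False, False, True],  # hypothetical detector output
    "2019Q2": [False, False, True],
}
print(quarterly_factors(["2019Q1", "2019Q2"], flags))
# [('2019Q1', 4, 2, 0.5), ('2019Q2', 3, 1, 0.333)]
```

Each tuple holds one quarter's (news_num, event_detected_num, hate_reported_index), i.e., the three event-related regressors added to the model.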
We adopt the data from 2007 to 2018 to fit the models, predict the hate crime trends after 2018, and compare them with the latest FBI statistics. The statistical evaluation metrics for event extraction include precision, recall, and F1 score, calculated as follows:

precision = TP / (TP + FP), recall = TP / (TP + FN), F1 = 2 * precision * recall / (precision + recall)

where TP, FP, and FN denote true positives, false positives, and false negatives. Considering that RMSE is sensitive to the data scale, MAPE is also employed to measure the prediction error as a percentage:

MAPE = (100 / N) * Σ_{t=1}^{N} |(y_t − ŷ_t) / y_t|

where y_t is the actual value, ŷ_t is the predicted value, and N is the number of observations in the test set. We first train and evaluate the event extraction model on the Patch Hate Crime dataset, which is divided into a train set (3619), a validation set (513), and a test set (1039). The other parameter settings follow [4] . We repeat the experiment 10 times and report the average precision, recall, and F1 in Table 2 . The performance on the test set is consistent with the results reported in [4] . The trained model is saved to detect whether a New York Times news report describes a hate crime event. We count the number of news reports and detected events quarterly and obtain two time series: the New York Times article count (news_num) and the hate crime prediction count (event_detected_num). We obtain the fbi_num_noseasonnal time series by removing the seasonal component from the quarterly hate crime time series (fbi_num). The ADF test shows that it has a nonlinear trend and is not a stationary sequence. A stationary sequence, d_fbi_num_noseasonnal, is obtained by a differencing operation. The AR and MA orders can be identified by the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) tests, which identify the orders as (1, 0), and the fitted expression is formulated as:

fbi_num_noseasonnal = 201.42 + 0.8776 * fbi_num_noseasonnal(−1)   (11)

with Adjusted R-squared 0.7415, log-likelihood −283.39, and Durbin-Watson statistic 1.65. We then use this fitted model to predict hate crime trends dynamically; the RMSE and MAPE are 93.11 and 4.60, respectively.
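The evaluation metrics above are standard and can be computed from scratch; a small self-contained sketch with toy values:

```python
# Sketch: the evaluation metrics used in the paper, computed from
# scratch. Precision/recall/F1 for the event detector; RMSE and MAPE
# (in percent) for the trend forecasts. Toy inputs only.

def prf1(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

def rmse(y, y_hat):
    return (sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)) ** 0.5

def mape(y, y_hat):
    return 100 / len(y) * sum(abs((a - b) / a) for a, b in zip(y, y_hat))

p, r, f = prf1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(round(p, 3), round(r, 3), round(f, 3))   # 0.667 0.667 0.667
print(round(rmse([100, 200], [110, 190]), 1))  # 10.0
print(round(mape([100, 200], [110, 190]), 1))  # 7.5
```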
This model serves as one of our base models (Model 1). We first fit Model 2 (equation 2), Model 3 (equation 3), and Model 4 (equation 5), then use them to forecast the hate crime values. We report the forecast evaluation metrics in Table 3 . Each row represents a model, and each column an evaluation metric. For convenient comparison, Model 1's evaluation is also included in Table 3 . From Table 3 , we make three observations. First, the regression models outperform the ARIMA model on almost all evaluation metrics. Second, among the regression models, those with event-related variables (Model 3 and Model 4) outperform the one without (Model 2). Third, Model 4 (RMSE 55.8065, MAPE 2.1355) performs best among the three regression models. This confirms our suspicion that the two directly extracted variables (news_num and event_detected_num) may not be sufficient and that the relative indicator hate_reported_index we constructed is a beneficial supplementary predictor. The experimental results show that event-extracted variables can effectively improve hate crime prediction performance, demonstrating the effectiveness of the event extraction module in our framework. Considering that the ARIMA model (Model 1) has the highest Adjusted R-squared, we build a model (Model 5) integrating the ARIMA model and the regression model (Model 4) to test whether an integrated model performs best. The experimental results are also shown in Table 3 : although its fitting metrics outperform the others, it does not match Model 4 on the forecast metrics. The likely reason is that the hate crime count in quarter 1 of 2019 declined precipitously; the inertia of the ARIMA component contributes additional error, leaving Model 5 with suboptimal performance on the test set. We visualize the performance by plotting the deseasonalized hate crime counts and the forecast values in Figure 5 .
It straightforwardly shows that the models considering the event-extracted factors perform better than those without them. However, we note that the gap between the predicted and actual hate crime numbers is rather large between late 2016 and 2017. The "Trump Effect" could well explain this phenomenon: Stephen and Griffin [23] concluded that it was not just Trump's inflammatory rhetoric throughout the political campaign that caused hate crimes to increase from November 2016; rather, his subsequent election as President of the USA validated that rhetoric in perpetrators' eyes and fueled the hate crime surge. Our approach can be easily adapted to predict state-level hate crime trends. We utilize the panel data analysis method for this task. The process includes five main steps: a) detect which state is involved in each news report of a hate crime; b) prepare the data for panel data regression analysis; c) perform fixed/random effects tests to determine the proper form of the panel data equations; d) estimate the parameters; e) forecast state-level hate crime trends. We first perform Named Entity Recognition (NER) to identify which states are involved in the hate crimes. To enhance recognition accuracy, we collect 730,000 American place and institution names and map them to their corresponding states. We sample 500 news articles and manually annotate the states where the hate crimes take place. The two annotators and our state recognition program achieve an agreement of 0.79 on Cohen's Kappa. This favorable performance shows that our state recognition method is qualified for preparing reliable panel data variables. The results (including RMSE, MAPE, and paired-samples tests) demonstrate the effectiveness and generalization of the event extraction module of our framework: it can be used to predict not only national but also state-level hate crime trends.
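Cohen's Kappa, used above to validate the state recognition program against human annotators, corrects raw agreement for chance. A minimal sketch with hypothetical state labels:

```python
# Sketch: Cohen's kappa for the agreement between the state-recognition
# program and human annotators. The state labels below are hypothetical.

def cohens_kappa(a, b):
    n = len(a)
    # Observed agreement: fraction of items where the raters match.
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    # Expected chance agreement from each rater's label frequencies.
    labels = set(a) | set(b)
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected)

annotator = ["NY", "NY", "CA", "TX", "NY", "CA"]
program   = ["NY", "CA", "CA", "TX", "NY", "CA"]
print(round(cohens_kappa(annotator, program), 3))  # 0.739
```

A kappa of 0.79, as reported above, is conventionally read as substantial agreement.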
Existing literature on the causes of hate crime includes prejudice theory, criminological theory, and social-psychological theory. Unfortunately, there is limited research on the general causes of hate crime. One shortcoming of this study is that we only introduce some criminological and social-psychological variables as predictive factors. For convenient comparison, we focused on demonstrating that event-extraction-related factors can enhance prediction ability, and thus ignored factors from other theories, e.g., prejudice theory. We will consider more factors introduced in other literature and verify their effectiveness in the future. In future work, we will also collect more data and conduct breakpoint tests to verify the "Trump effect", encode this variable into a two-stage model, and check whether it enhances model performance. We are also continuing to design new models, including Copula-based models and neural network models. Hate crimes continue to be a pervasive problem in the United States. The FBI's UCR Program collects national hate crime data and publishes statistical reports every year, but the statistics are mostly released more than one year later. It is important for practitioners, policy-makers, and researchers to grasp national and local hate crime trends in time. We propose a framework that introduces NLP technologies, especially event extraction, into national hate crime trend prediction to alleviate this challenge. The framework can also be easily adapted to predict state-level hate crime trends by integrating panel data analysis. The experimental results show that event-extracted factors significantly enhance prediction performance. Our framework broadens the approaches to hate crime trend prediction. It can provide valuable holistic and strategic insight for law enforcement agencies and help lawmakers justify specific legislation in advance.
[1] On hate crime victimization: Rates, types, and links with suicide risk among sexual orientation minority special interest group members.
[2] A lexicon-based approach for hate speech detection.
[3] A BERT-based transfer learning approach for hate speech detection in online social media.
[4] Reporting the unreported: Event extraction for analyzing the local representation of hate crimes.
[5] Analyzing the targets of hate in online social media.
[6] A survey of event extraction methods from text for decision support systems.
[7] Giveme5W1H: A universal system for extracting main events from news articles.
[8] Semantic-enhanced and context-aware hybrid collaborative filtering for event recommendation in event-based social networks.
[9] Race, ethnicity and national origin-based discrimination in social media and hate crimes across 100 US cities.
[10] Psychological sequelae of hate-crime victimization among lesbian, gay, and bisexual adults.
[11] From fear to hate: How the Covid-19 pandemic sparks racial animus in the United States.
[12] A multiple instance learning framework for identifying key sentences and detecting events.
[13] Hate crime: An emergent research agenda. Annual Review of Sociology.
[14] Understanding hate crimes: Sociological and criminological perspectives.
[15] A general theories of hate crime? Strain, doing difference and self control.
[16] A common target: Anti-Jewish hate crime in New York City communities.
[17] Social learning and social control in the off- and online pathways to hate crime and terrorist violence.
[18] In the name of hate: Understanding hate crimes. Routledge.
[19] Genocide, mass atrocity, and theories of crime: Unlocking criminology's potential.
[20] Social psychological perspectives on Trump supporters.
[21] The contribution of social-psychological strain theory to the explanation of crime and delinquency.
[22] Understanding hate crimes: Perspectives from the wider social sciences.
[23] The effect of President Trump's election on hate crimes. Available at SSRN 3102652.