key: cord-0681061-prg0hszk
authors: Sarthak,; Shukla, Shikhar; Mittal, Govind; Arya, Karm Veer
title: Detecting Hostile Posts using Relational Graph Convolutional Network
date: 2021-01-10
journal: nan
DOI: nan
sha: 2b140e59c5e6bf08b9c3c5693e0b4cedafe7649f
doc_id: 681061
cord_uid: prg0hszk

This work is based on our submission to the Hindi Constraint competition conducted by AAAI@2021 for the detection of hostile posts in Hindi on social media platforms. We present a model that detects hostile posts and further classifies them into fake, offensive, hate and defamation using Relational Graph Convolutional Networks. Unlike existing work, our approach uses semantic meaning along with contextual information for better classification. The results from AAAI@2021 indicate that the proposed model performs on par with Google's XLM-RoBERTa on the given dataset. Our best submission with RGCN achieves an F1 score of 0.97 (7th rank) on coarse-grained evaluation and achieved the best performance on identifying fake posts. Among all submissions to the challenge, our classification system with XLM-RoBERTa secured 2nd rank on fine-grained classification.

Though Hindi is one of the most prominent languages in online communities, there has been a lack of research on identifying toxic and fake posts in Hindi. The need for such solutions has been felt more acutely in light of the Covid pandemic. Online forums have been flooded with unproven remedies to cure and prevent the spread of the virus. False information regarding government policies such as lockdowns has often led to panic and shortages of essential supplies in communities. On the other hand, hate and offensive posts on online social forums targeting specific groups and religious beliefs have sometimes sparked violent incidents. Therefore, it is imperative to develop systems that can filter out such posts. Through the "Hostile Post Detection in Hindi" task, the organizers have provided a manually annotated dataset [2] which can be immensely useful in combating the above-mentioned challenges.

These two authors contributed equally. arXiv:2101.03485v1 [cs.CL] 10 Jan 2021

The paper is organized as follows. Section 2 describes related work. Section 3 describes the background work on Relational Graph Convolutional Networks. Section 4 describes the methodologies of the proposed models. Section 5 presents an analysis of the performance of the proposed models. The paper is concluded in Section 6.

Several approaches have been proposed for classifying toxic comments on the datasets released by Google Jigsaw during a Kaggle competition and on Twitter data [6]. Several authors [1, 8] suggested an ensemble learning method which outperformed baselines by 7%. However, another dataset was released later because models were becoming biased towards certain identity words (gay, black, etc.). The top solutions used transformers to achieve state-of-the-art results on this dataset.

However, all of these models were designed for classifying posts in English. Priya et al. [15] compiled a Hindi-English code-mixed dataset for detecting hate speech in social media communications. They also proposed an architecture using Word2vec and FastText embeddings for classification. For the given dataset, the organizers [2] presented an SVM baseline model which used multilingual embeddings as input.

Recently, novel architectures [23, 24] using Graph Convolutional Networks [12, 16] have achieved state-of-the-art results for text classification on GAP [21] , Ohsumed [9] , 20ng [7] , and Reuters [7] datasets. They generalize better which motivated us to use them for this task. Also, to the best of our knowledge, this is the first attempt at using Relational Graph Convolutional Networks for classifying hostile comments in Hindi.

The recently proposed XLM-Roberta [5], a multilingual model, outperforms all the other transformer architectures on classification tasks such as Natural Language Inference (NLI) and Named Entity Recognition (NER) in multiple languages. It was also a part of the top performing solutions in a recent Kaggle challenge on identifying hostile posts in multiple languages. This motivated us to finetune the model on our dataset.

GCNs [12, 4] are used for performing convolution over graphs. GCNs, with and without labeled edges, have been used in many text classification tasks [20, 24, 23] and have achieved state-of-the-art results. Let $G = (V, E)$ be a directed graph with nodes $v_i \in V$ and edges $(v_i, v_j) \in E$ with label $r$. The hidden state of each node $v_i$ is represented as $h_i$, where $h_i$ is a $d_0$-dimensional vector. Every node $v_i$ aggregates the information $h_j$ of its neighbours as described in Eq. (1).
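For reference, the aggregation step can be sketched in the standard R-GCN form using the symbols of this section. This is a reconstruction under assumptions, not necessarily the exact form of Eq. (1): $R$ (the set of edge labels) is a name introduced here, $N_i^r$ is read as the set of neighbours of node $i$ under relation $r$, and the self-connection is assumed to be covered by the self-loop relation rather than by a separate weight matrix.

```latex
h_i^{(l+1)} = \psi\left( \sum_{r \in R} \; \sum_{j \in N_i^r} \frac{1}{c_{i,r}} \, W_r^{(l)} h_j^{(l)} \right)
```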

where $\psi$ denotes an activation function, $c_{i,r}$ represents a normalization constant, $N_i^r$ represents the neighbourhood of node $i$ under relation $r$ (its size gives the degree of the node for that relation), and $W_r^{(l)}$ represents the weight matrix under relation $r$.

In many cases, not all edges might be relevant to the model. The concept of Edge importance was introduced by Marcheggiani et al. [16] for helping the model in identifying the erroneous or irrelevant edges. The gating value for an edge is computed as:

where σ(·) is the sigmoid function, w k (u,v) and b k (u,v) are gating parameters at layer l. Finally, the gated R-GCN embedding or hidden state of node for our model is computed as:
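The gating and aggregation steps can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the gate form sigmoid(h_u · w_r + b_r) with one gating vector and one bias per relation, the ReLU activation, the normalizer taken as the number of incoming edges per relation, and all function and parameter names are assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, h):
    """Multiply matrix W (a list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gated_rgcn_layer(h, edges, W, gate_w, gate_b):
    """One gated R-GCN layer (illustrative sketch).

    h:      list of node vectors; h[i] is the hidden state of node i
    edges:  list of (u, v, r); a message flows from u to v under relation r
    W:      dict r -> weight matrix with d_out rows and d_in columns
    gate_w: dict r -> gating vector of length d_in (assumed form)
    gate_b: dict r -> scalar gating bias (assumed form)
    """
    d_out = len(next(iter(W.values())))
    # Normalizer c[(v, r)]: number of incoming edges of v under relation r.
    c = {}
    for _, v, r in edges:
        c[(v, r)] = c.get((v, r), 0) + 1
    out = [[0.0] * d_out for _ in h]
    for u, v, r in edges:
        # Scalar gate in (0, 1) that can down-weight an irrelevant edge.
        gate = sigmoid(sum(a * b for a, b in zip(h[u], gate_w[r])) + gate_b[r])
        msg = matvec(W[r], h[u])
        for k in range(d_out):
            out[v][k] += gate * msg[k] / c[(v, r)]
    # ReLU stands in for the activation function (an assumption).
    return [[max(0.0, x) for x in row] for row in out]
```

With zero gating parameters every gate equals 0.5, so each message is simply halved; learned gating parameters let the layer suppress individual edges instead.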

Capturing syntactic information in a sentence helps in capturing the semantic meaning and has often been found to be useful in many tasks [24, 25] when fed into models. Fig. 1 shows a syntactic graph of a sentence. 

In this section, we present three model architectures. The first subsection elaborates on the proposed architecture using R-GCN and multilingual BERT. The next two subsections discuss how to finetune pretrained architectures for this task.

Our deep learning-based multilabel classification architecture (as seen in Fig. 2) is inspired by the approach followed by Xu et al. [24] and Yao et al. [25]. It consists of two layers in parallel for capturing the contextual and semantic information of the sentence.

-Context Embedding: In this layer, the original Hindi text is fed into bert-base-multilingual-cased and the pooler output is captured. This output corresponds to the [CLS] token, which BERT prepends to every sentence. The embedding of this token is obtained by pooling, that is, by making it dependent on each token in the sentence, and is finetuned using sequence classification tasks. It captures the contextual information better than the semantic information.

-Syntactic Embedding: We were not able to find dependency parsers for Hindi text, so we translated the data to English and performed dependency parsing on it using spaCy [10]. Dependency parsing is done for each sentence, giving us a labeled directed graph with the tokens of the sentence as nodes. As in [16] and [23], we use three types of relationships for edge labels: heads to dependents $(u, v)$, dependents to heads $(u, v)^{-1}$, and self-loops $(u, u)$, where $(u, v) \in E$ and $u, v \in V$. Instead of randomly initialising the hidden state of each node, we used the embedding of each token obtained from the pretrained Bert-Large-Uncased. This helps the gated R-GCN capture semantic task-specific embeddings. Finally, we obtain the sentence semantic embedding $h_{mean}$ by average pooling over all token embeddings.
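The three edge types described above can be sketched as follows. This is a minimal illustration, assuming the parse is given as a list of head indices in which the root token points to itself (the convention spaCy uses for `token.head`); the function name and the relation label strings are hypothetical.

```python
def build_relational_edges(heads):
    """Turn a dependency parse into labeled directed edges.

    heads[i] is the index of the head of token i; the root points to itself.
    Returns a list of (source, target, relation) triples.
    """
    edges = []
    for dep, head in enumerate(heads):
        edges.append((dep, dep, "self"))               # self-loop (u, u)
        if head != dep:                                # the root has no head edge
            edges.append((head, dep, "head_to_dep"))   # (u, v)
            edges.append((dep, head, "dep_to_head"))   # (u, v)^-1
    return edges


# "She reads books": both "She" and "books" depend on the root "reads".
example_edges = build_relational_edges([1, 1, 1])
```

Adding the inverse and self-loop relations lets information flow in both directions along each dependency arc while each node also retains its own state.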

The pooler output from BERT and the pooled output from the gated R-GCN are concatenated. The reason for doing this is two-fold. First, it captures both the semantic and the contextual information of the text. Secondly, some contextual information might get lost in translation, which might impact the semantic information as well. As is evident from Fig. 3, the semantic meaning is almost intact but a word is incorrectly translated (highlighted in green: the Hindi text mentions a "well", but it is translated as a "queue"). The concatenated output is then passed through a fully connected layer for the final prediction.
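The pooling, concatenation and prediction steps can be sketched in a few lines of plain Python. This is an illustration under assumptions: the per-label sigmoid (chosen here because the task is multilabel), the function names, and the toy dimensions are not taken from the paper.

```python
import math


def mean_pool(token_vectors):
    """Average the token embeddings column-wise to get one sentence vector."""
    n = len(token_vectors)
    return [sum(column) / n for column in zip(*token_vectors)]


def classify(pooler_output, rgcn_token_vectors, weights, biases):
    """Concatenate both embeddings and apply one fully connected layer."""
    features = pooler_output + mean_pool(rgcn_token_vectors)  # list concatenation
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    # Independent sigmoid per label (an assumption for the multilabel setting).
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]
```

Each row of `weights` has one entry per concatenated feature, so the layer can weigh the contextual and syntactic parts of the representation separately for every label.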

We also experimented with combining RGCN with pooler outputs from other architectures; the results can be seen in Table 1.

The given dataset had tweets that contained unwanted tokens such as tags and shortened URLs. We cleaned these using the "fixhtml" rule, available as part of the Fastai [11] text processing package.
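A cleaning step of this kind can be approximated with the Python standard library. The sketch below is not the Fastai rule itself: the regular expressions, the reading of "tags" as HTML-style markup, and the function name are assumptions made for illustration.

```python
import html
import re

URL_RE = re.compile(r"https?://\S+")  # matches links such as shortened URLs
TAG_RE = re.compile(r"<[^>]+>")       # matches markup such as <br />


def clean_post(text):
    """Remove HTML entities, markup tags and URLs from a post."""
    text = html.unescape(text)     # "&amp;" becomes "&"
    text = TAG_RE.sub(" ", text)   # drop markup tags
    text = URL_RE.sub(" ", text)   # drop URLs
    return " ".join(text.split())  # collapse repeated whitespace
```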

-Tokenization: We used sentencepiece with two tokenization schemes:

• UnigramTokenizer : We used a vocabulary size of 20k with the unigram language model [13]. It starts out with a seed vocabulary and keeps dropping a fixed percentage of subwords to optimize the marginal likelihood until the desired vocabulary size is reached, as shown in Eq. (5).

where $x = (x_1, x_2, \ldots, x_M)$ is a subword sequence of length $M$, $V$ is the vocabulary, $X$ is the corpus, $S(X)$ represents the set of possible subword sequences, and $|D|$ represents the number of sets in $S$.

• Byte Pair Encoding: BPE [17] starts out by computing the frequency of characters in the text and iteratively merges the most frequent pair of tokens found so far. The newly merged tokens are added to the initial list and the frequency of each token is recalculated. This is repeated until the set vocabulary limit is reached. We used a vocabulary size of 20k. We got better results with BPE than with Unigram tokenization, as observed from Table 1.

We trained AWD-LSTM [14] and AWD-QuasiRNN [3] models from scratch. In several tasks, these two models have performed comparably to the more recent transformer architectures. These networks use DropConnect, which sets a randomly selected subset of weights to zero. QRNN alternates recurrent and convolutional layers, which speeds up training and testing because of parallel operations. They also outperform stacked LSTMs of similar size (see Table 1).
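The BPE merge loop described in the tokenization step can be sketched in plain Python. This is a toy illustration of the algorithm, not the sentencepiece implementation used in the experiments; the function name and the word-level input format are assumptions.

```python
from collections import Counter


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` BPE merge rules from a list of words."""
    # Each word starts as a tuple of single characters.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent pair of symbols occurs.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word with the most frequent pair merged.
        merged_vocab = Counter()
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            merged_vocab[tuple(out)] += freq
        vocab = merged_vocab
    return merges
```

In practice the loop runs until the target vocabulary size (20k in the experiments above) is reached rather than for a fixed number of merges.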

We used pretrained multilingual transformer models from the Huggingface library [22]. We finetuned the XLM-Roberta Large model, which has been trained on CommonCrawl data in 100 languages with a masked language modeling objective on text sampled from multiple languages.

-Dataset augmentation: We also experimented with finetuning transformers on text from Kaggle's Toxic Comment Classification Challenge. Then we translated the Hindi Hostile Post dataset to English using Google Translate and used these fine-tuned models for further training and classification. Separately, we also augmented our Hindi dataset with data from HASOC (2019) challenge and finetuned XLM-Roberta on this dataset.

-Pseudo-labeling: We used the predictions made by our trained models on the test set as soft labels and retrained the model after upsampling this data. Data augmentation and pseudo-labeling didn't provide any performance boost to our models.

The training and validation datasets [2] were provided as part of the competition, which were split into hostile and non-hostile classes. The hostile posts were further divided into fake, hate, offensive and defamation categories. The training and validation sets had 5728 and 811 samples respectively. The comments had an average length of 30 tokens. We have presented word clouds for translated data of each class in Fig. 5 . 

The metric for evaluation in this task was the F1 score. Since the data suffers from class imbalance and it is important to maintain a balance between precision and recall, the weighted F1 score is a better metric for evaluation. We evaluate our models 6 on the test set and the F1 scores are reported in Table 1.

6 https://github.com/shikhar00778/constraint21
7 bestfit ai 3rd submission
8 bestfit ai 2nd submission

We extracted the last-layer embeddings from the trained XLM-Roberta model and visualized them on the validation set after applying PCA, as shown in Fig. 7a. The plot illustrates why the model performs well in classifying non-hostile and fake posts. The embeddings for hate, offensive and defamation posts are clustered together, which results in poor performance on those classes.

In Fig. 6, we visualize the salience maps of a few samples from the validation set, using Google's Language Interpretability Tool [19] with XLM-Roberta. It depicts each token's contribution to the final prediction made by the model. The tokens in red are less important, while the model focuses more on the blue tokens. Plot 6a demonstrates how the model has learnt to attach more weight to words such as "Hindu" and "Muslim". These words tend to appear often in hostile posts in the given dataset.

Sentence embeddings from the RGCN layer, multilingual BERT, and the concatenated embeddings from the trained RGCN-BERT model were also extracted and visualized on the test set using t-SNE, as shown in Fig. 7. Plot 7c illustrates why the model achieves a high coarse-grained score. Fig. 7b illustrates how multilingual BERT can clearly separate fake news from the other classes. Fig. 7d illustrates the final sentence embeddings obtained after concatenating the RGCN and multilingual BERT embeddings. Fig. 8 presents the confusion matrix of the trained RGCN+BERT model on the test dataset for all the categories.

Overall, the model can clearly identify non-hostile and fake posts owing to the advantages that the RGCN and BERT embeddings carry, which explains why it outperformed all other submissions in classifying fake posts.
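The weighted F1 metric used for evaluation in this section can be made concrete with a short sketch. It assumes one label per post for simplicity, whereas the fine-grained task is multilabel; the function name and the toy labels are illustrative and not part of the competition's evaluation script.

```python
def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    total = len(y_true)
    score = 0.0
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        support = tp + fn  # number of true examples of this class
        score += f1 * support / total
    return score
```

Weighting by support means that frequent classes such as non-hostile posts contribute more to the final score than rare ones such as defamation.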

In this paper, a model is presented for detecting hostile posts and further classifying them into fake, hate, offensive and defamation. Combining semantic information with contextual information was observed to improve performance. The proposed model achieved 5th rank on fine-grained evaluation in the Hindi Constraint competition organized by AAAI@2021. RGCN with BERT performed better than all the other teams' submissions to the challenge on classifying fake posts, achieving an F1 score of 82.4. A limitation of the proposed model is its weak performance on defamation posts, with a best F1 score of 44.9 across all our team's submissions. Future Scope: The work can be extended by performing syntactic analysis directly on Hindi text rather than on translated text. Furthermore, finetuning multilingual T5 for this task may also yield better results. Also, ensembling the results of the XLM-Roberta and RGCN+BERT architectures might give better results, since the former performed well on defamation while the latter achieved state-of-the-art results for fake news detection.

Challenges for toxic comment classification: An in-depth error analysis

Hostility detection dataset in hindi

Quasi-recurrent neural networks

Spectral networks and locally connected networks on graphs

Unsupervised cross-lingual representation learning at scale

Automated hate speech detection and the problem of offensive language

UCI machine learning repository

Detecting online hate speech using context aware models

Ohsumed: An interactive retrieval evaluation and new large test collection for research

spaCy: Industrial-strength Natural Language Processing in Python

Fastai: A layered api for deep learning

Semi-supervised classification with graph convolutional networks

Subword regularization: Improving neural network translation models with multiple subword candidates

Regularizing and optimizing lstm language models

A comparative study of different state-of-the-art hate speech detection methods in hindi-english code-mixed data

Modeling relational data with graph convolutional networks

Neural machine translation of rare words with subword units

The language interpretability tool: Extensible, interactive visualizations and analysis for NLP models

Dating documents using graph convolution networks

Mind the gap: A balanced corpus of gendered ambiguou

Huggingface's transformers: State-of-the

Graph convolutional networks for text classification

Look again at the syntax: Relational graph convolutional network for gendered ambiguous pronoun resolution

Semantics-aware bert for language understanding