Emotion-Aware Polarity Lexicons for Twitter
Sentiment Analysis

Anil Bandhakavi, Nirmalie Wiratunga, Stewart Massie and Deepak P.

Abstract Theoretical frameworks in psychology map the relationships between
emotions and sentiments. In this paper we study the role of such a mapping for
computational emotion detection from text (e.g. social media), with the aim of
understanding the usefulness of an emotion-rich corpus of documents (e.g. tweets)
for learning polarity lexicons for sentiment analysis. We propose two different
methods that leverage a corpus of emotion-labelled tweets to learn word-polarity
lexicons. The proposed methods model the emotion corpus using a generative
unigram mixture model (UMM), combined with the emotion-sentiment mapping
proposed in Psychology, for automated generation of word-polarity lexicons that
capture emotion-rich vocabulary. We compare the quality of the emotion-aware
sentiment lexicons learnt by the proposed mixture model with those generated
using supervised latent Dirichlet allocation (sLDA) and word-document frequency
(WDF) statistics. Sentiment analysis experiments on benchmark Twitter data sets
confirm the quality of our proposed lexicons. Further, a comparative analysis
with sLDA- and WDF-based emotion-aware lexicons and standard sentiment lexicons
that are agnostic to emotion knowledge suggests that the proposed lexicons lead
to significantly better performance in both sentiment classification and
sentiment intensity prediction tasks.

1 Introduction

Sentiment analysis concerns the computational study of natural language text (e.g.
words, sentences and documents) in order to identify and effectively quantify its
polarity (i.e. positive or negative) [28]. Sentiment lexicons are the most popular
resources used for sentiment analysis, since they capture the polarity of a large
collection of words. These lexicons are either hand-crafted (e.g. opinion lexicon
[15], General Inquirer [36] and MPQA subjectivity lexicon [38]) or generated (e.g.
SentiWordNet [9] and SenticNet [7]) using linguistic resources such as WordNet [10]
and ConceptNet [20]. However, text on social media (e.g. Twitter) contains special
symbols, non-standard spellings, punctuation and capitalization, sequences of
repeated characters and emoticons, for which the aforementioned lexicons have
limited or no coverage.

Anil Bandhakavi, Nirmalie Wiratunga, Stewart Massie
School of Computing, Robert Gordon University, Aberdeen, UK, e-mail:
{a.s.bandhakavi, n.wiratunga, s.massie}@rgu.ac.uk

Deepak P.
Queen’s University, Belfast, UK, e-mail: deepaksp@acm.org

As a result, domain-specific sentiment lexicons were developed to capture the
informal and creative expressions used on social media to convey sentiment [24, 11].
The extraction of such lexicons is possible with limited effort, due to the
abundance of weakly-labelled sentiment data on social media, obtained using
emoticons [13, 14]. However, sentiment on social media is not limited to conveying
positivity and negativity. Sociolinguistics suggests that on social media, people
express a wide range of emotions such as anger, fear, joy, sadness etc. [6].
Following the trends in lexicon-based sentiment analysis, research in textual
emotion detection also developed lexicons that can not only capture the emotional
orientation of words [25, 31], but also quantify their emotional intensity [35, 33].

Though research in psychology defines sentiment and emotion differently [26],
it also provides a relationship between them [4]. Further, research in emotion
classification [37, 12] demonstrated the usefulness of sentiment features extracted
using a lexicon for document representation. Similarly, emoticons used as features
to represent documents improved sentiment classification [14, 24]. However,
the exploration of emotion knowledge for sentiment analysis is limited to
emoticons [14, 16, 17], leaving a host of creative expressions such as emotional
hashtags (e.g. #loveisbliss), elongated words (e.g. haaaappyy!!!) and their
concatenated variants unexplored. An emotion corpus crawled from Twitter using seed
words for different emotions, as in [23, 37], can potentially serve as a knowledge
resource for sentiment analysis. Adopting such corpora for sentiment analysis, e.g.
for sentiment lexicon extraction, is particularly interesting, given the challenges
involved in developing effective models which can cope with the lexical variations
on social media.

Therefore, in this work we explore the role of a Twitter emotion corpus for
extracting a sentiment lexicon, which can be used to analyse the sentiment of tweets.
We do a qualitative comparison between standard sentiment lexicons that are
agnostic to emotion knowledge, emotion-aware sentiment lexicons generated using
techniques such as supervised latent Dirichlet allocation (sLDA), and the proposed
sentiment lexicons. Our contributions in this paper are as follows:

1. We propose two different methods to generate sentiment lexicons from a cor-
pus of emotion-labelled tweets by combining our prior work on domain-specific
emotion lexicon generation [1, 2], with the emotion-sentiment mapping pre-
sented in Psychology (see figure 1) [4]; and

2. We comparatively evaluate the quality of the proposed sentiment lexicons, emotion-
aware sentiment lexicons learnt using sLDA and the standard sentiment lexicons
found in literature through different sentiment analysis tasks: sentiment intensity
prediction and sentiment classification on benchmark Twitter data sets.



In the rest of the paper we review related literature in Section 2. In Sections 3 and
4 we formulate the methods to extract sentiment lexicons from an emotion corpus of
tweets. Section 5 presents the baseline methods to extract sentiment lexicons from
an emotion corpus of tweets. In Section 6 we describe our experimental setup and
analyse the results. Section 7 presents our conclusions.

2 Related Work

In this section we review the literature concerning sentiment lexicons, followed by a
review of different emotion theories and their relationship with sentiments proposed
in Psychology.

2.1 Lexicons for Sentiment Analysis
Broadly, sentiment lexicons are of two types: hand-crafted and automatic.
Hand-crafted lexicons such as opinion lexicon [15], General Inquirer [36] and MPQA
subjectivity lexicon [38] have human-assigned sentiment scores. Automatic lexicons,
on the other hand, are of two types: corpus-based and resource-based. Lexicons
such as SentiWordNet [9] and SenticNet [7] are resource-based, since they are
extracted using linguistic resources such as WordNet [10] and ConceptNet [20].

A common limitation of resource-based and hand-crafted lexicons is that they
have a static vocabulary, which limits their effectiveness for mining sentiment on
social media, which is inherently dynamic. Corpus-based lexicons such as those in
[24, 11] gauge the corpus-level variations in sentiment using statistical models,
and are found to be very effective on social media. Further, with the abundance of
weakly-labelled sentiment data on social media, these lexicons can be updated at
very low cost. Similarly, research in emotion analysis led to the development of
resource-based [25, 31] and corpus-based emotion lexicons [35, 33].

Prior research in sentiment analysis developed models that exploit emotion
knowledge, such as emoticons, to gain performance improvements [14, 16, 17].
However, other forms of emotion knowledge, such as an emotion corpus and the
lexicons learnt from it, could potentially have richer sentiment-relevant
information than emoticons. Therefore it is interesting to study the role of such
emotion knowledge for sentiment analysis, in particular for sentiment lexicon
generation, and validate its usefulness. Our work focusses on this aspect, by
exploiting an emotion-labelled corpus of tweets to learn sentiment lexicons. We
achieve this by combining our prior work on generative mixture models for lexicon
extraction and the emotion-sentiment mapping provided in psychology.

2.2 Emotion Theories
Research in psychology proposed many emotion theories, wherein each theory or-
ganizes a set of emotions into some structural form (e.g. taxonomy). In the following
sections we detail the most popular emotion theories studied in psychology.



2.2.1 Ekman Emotion Theory

Paul Ekman, an American psychologist, focused on identifying the most basic set
of emotions that can be expressed distinctly in the form of a facial expression. The
emotions identified as basic by Ekman are anger, fear, joy, sadness, surprise and
disgust [8].

2.2.2 Plutchik’s Emotion Theory

Unlike the Ekman emotion model, Plutchik’s emotion model defines eight basic
emotions: anger, anticipation, disgust, fear, joy, sadness, surprise and trust [30].
These basic emotions are arranged as bipolar pairs, namely joy-sadness,
trust-disgust, fear-anger and surprise-anticipation.

2.2.3 Parrot’s Emotion Theory

Parrot organised emotions in a three-level hierarchical structure [29]. The levels
represent primary, secondary and tertiary emotions respectively. Parrot identified
love, joy, surprise, anger, sadness and fear as the primary emotions. Though the
Ekman and Plutchik emotion models are popular, research in Twitter emotion
detection [37, 32] focussed on emotions that largely overlap with those of Parrot,
given how commonly they are expressed on social media. We use the Parrot
emotion-labelled Twitter corpus [37] in this study for generating sentiment lexicons.

2.2.4 Emotion-Sentiment Relationship in Psychology

One of the popular approaches for emotion modelling in Psychology is the
dimensional approach, wherein each emotion is considered as a point in a continuous
multidimensional space where each aspect or characteristic of an emotion is
represented as a dimension. Affect variability is captured by two dimensions, namely
valence and arousal [18]. Valence (pleasure-displeasure) depicts the degree of
positivity or negativity of an emotion. Arousal (activation-deactivation) depicts
the excitement or the strength of an emotion. The dimensional approach depicting
Parrot’s primary emotions in the valence-arousal 2D space is shown in Figure 1 [3].

3 Emotion-Aware Models for Sentiment Analysis

In this section we formulate two different methods which utilize a corpus of
emotion-labelled documents for sentiment analysis of text. The first method learns
an emotion lexicon and further transforms it into a sentiment lexicon using the
emotion-sentiment mapping (refer section 2.2) proposed in Psychology. The sec-
ond method on the other hand learns the sentiment labels for the documents in the
emotion corpus using the emotion-sentiment mapping, followed by a sentiment lex-
icon extraction. The two proposed methods are illustrated visually in figures 2a and
2b.



Fig. 1: Parrot’s emotions in the valence-arousal plane of the dimensional model

3.1 Emotion Corpus-EmoSentiLex
A simple way to utilize a corpus of emotion-labelled documents, $X_E$, for sentiment
analysis is to first learn an emotion lexicon, and further transform it into a
sentiment lexicon. An emotion lexicon UMM-EmoLex in our case is a $|V| \times (k+1)$
matrix, where UMM-EmoLex$(i, j)$ is the emotional valence of the $i$th word in
vocabulary $V$ to the $j$th emotion in $E$ (the set of $k$ emotions) and
UMM-EmoLex$(i, k+1)$ corresponds to its neutral valence (refer section 4). Observe
that $k$ emotions are considered and column $k+1$ is the neutral class. Further,
using the emotion-sentiment mapping proposed in Psychology, we transform the emotion
lexicon UMM-EmoLex into a sentiment lexicon UMM-EmoSentiLex, which is a
$|V| \times 1$ matrix, as follows:

\[
\text{UMM-EmoSentiLex}(i) = \log\left( \frac{\sum_{m \in E^+} \text{UMM-EmoLex}(i, m)}{\sum_{n \in E^-} \text{UMM-EmoLex}(i, n)} \right) \tag{1}
\]

where $E^+ \subset E$ and $E^- \subset E$ are the sets of positive and negative
emotions according to the emotion-sentiment mapping, and $m$ and $n$ iterate over
the positive and negative emotions respectively. Note that the log scoring assigns
positive values to words having stronger associations with emotions such as Joy,
Surprise and Love, and negative values to words having stronger associations with
emotions such as Anger, Sadness and Fear. Therefore we expect that sentiment
knowledge for words is implicitly captured in an emotion lexicon, and can be easily
extracted using this simple transformation.
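
The transformation in Eq. 1 can be sketched in a few lines of Python with NumPy.
This is a minimal illustration rather than the authors' implementation: the function
name `emolex_to_sentilex`, the toy valences, the column ordering and the small
constant `eps` (added only to avoid division by zero) are assumptions made for the
example.

```python
import numpy as np

def emolex_to_sentilex(emolex, pos_cols, neg_cols, eps=1e-12):
    """Collapse a |V| x (k+1) emotion lexicon into |V| log-ratio scores (Eq. 1).

    pos_cols / neg_cols index the columns of the positive (E+) and negative (E-)
    emotions; the neutral column is not used. eps is an assumed smoothing constant.
    """
    emolex = np.asarray(emolex, dtype=float)
    pos = emolex[:, pos_cols].sum(axis=1)   # sum over m in E+
    neg = emolex[:, neg_cols].sum(axis=1)   # sum over n in E-
    return np.log((pos + eps) / (neg + eps))

# Hypothetical toy lexicon; columns: joy, love, surprise, anger, sadness, fear, neutral
toy_emolex = np.array([
    [0.40, 0.30, 0.10, 0.02, 0.03, 0.05, 0.10],   # a joy-leaning word
    [0.02, 0.03, 0.05, 0.30, 0.20, 0.30, 0.10],   # a fear/anger-leaning word
])
scores = emolex_to_sentilex(toy_emolex, pos_cols=[0, 1, 2], neg_cols=[3, 4, 5])
```

A positive score marks a word that leans towards the positive emotions and a
negative score one that leans towards the negative emotions; the magnitude can be
read as intensity.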

Using the above method, any automatically generated emotion lexicon can be
converted into a sentiment lexicon. For example, automatic emotion lexicons learnt
from a corpus of emotion-labelled tweets using methods such as latent Dirichlet
allocation (LDA) can be used to learn emotion-aware sentiment lexicons. We refer
the readers to section 5 for the lexicon generation process using LDA and
word-frequency statistics. Though the above method induces a sentiment lexicon, it
does not model the document-sentiment relationships to learn the lexicon, which is
important to quantify word-sentiment associations. Therefore we introduce an
alternate method which overcomes this limitation while utilizing an emotion corpus
for sentiment lexicon generation.

(a) Emotion Corpus-EmoSentiLex

(b) Emotion Corpus-SentiLex

Fig. 2: Emotion-Aware Models for Sentiment Analysis

3.2 Emotion Corpus-SentiLex
An alternate way to utilize the emotion corpus $X_E$ for sentiment analysis is to
transform it into a sentiment corpus $X_S$ by learning the sentiment label for each
document $d \in X_E$. This is done by using the emotion-sentiment mapping as follows:

\[
\text{Sentiment}(d) =
\begin{cases}
\text{positive} & \text{if } \text{emotion}(d) \in E^+ \\
\text{negative} & \text{if } \text{emotion}(d) \in E^-
\end{cases} \tag{2}
\]
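
As a small sketch of this relabelling step, the snippet below maps each
emotion-labelled document to a sentiment label. The split of Parrot's primary
emotions follows the grouping named in Section 3.1 (joy, surprise and love as
positive; anger, sadness and fear as negative); the function names and the
`(text, emotion)` tuple format are hypothetical.

```python
# Emotion-to-polarity grouping as described in Section 3.1.
POSITIVE_EMOTIONS = {"joy", "love", "surprise"}   # E+
NEGATIVE_EMOTIONS = {"anger", "sadness", "fear"}  # E-

def sentiment_label(emotion):
    """Return the sentiment label of a document given its emotion label (Eq. 2)."""
    if emotion in POSITIVE_EMOTIONS:
        return "positive"
    if emotion in NEGATIVE_EMOTIONS:
        return "negative"
    raise ValueError(f"emotion not covered by the mapping: {emotion!r}")

def to_sentiment_corpus(emotion_corpus):
    """Turn (text, emotion) pairs into (text, sentiment) pairs."""
    return [(text, sentiment_label(emotion)) for text, emotion in emotion_corpus]

# Hypothetical two-document emotion corpus.
toy_corpus = [("going to Paris this Saturday", "joy"), ("so worried", "fear")]
sentiment_corpus = to_sentiment_corpus(toy_corpus)
```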



The sentiment lexicon UMM-SentiLex learnt from the corpus $X_S$ is a $|V| \times 3$
matrix, where UMM-SentiLex$(i,1)$, UMM-SentiLex$(i,2)$ and UMM-SentiLex$(i,3)$ are
the positive, negative and neutral valences corresponding to the $i$th word in
vocabulary $V$. Observe that, unlike the method which learns UMM-EmoSentiLex by
aggregating word-level emotion scores into sentiment scores, this method learns the
sentiment-class knowledge corresponding to the documents before learning a
word-sentiment lexicon. We expect this additional layer of supervision to benefit
performance, following the findings of earlier research in supervised and
unsupervised sentiment analysis. In the following section we briefly explain our
proposed method to generate UMM-SentiLex and UMM-EmoLex. Further details about our
proposed method can be found in [1, 2].

4 Mixture Model for Lexicon Generation

In this section we describe our proposed unigram mixture model (UMM) applied
to the task of emotion lexicon (UMM-EmoLex) generation. Sentiment lexicon
(UMM-SentiLex) generation is a special case of emotion lexicon generation, where
the $k$ emotion classes are reduced to positive and negative classes. Therefore we
continue the presentation for the general case, i.e. UMM-EmoLex generation.

We model real-world emotion data to be a mixture of emotion-bearing words
and emotion-neutral (background) words. For example, consider the tweet "going to
Paris this Saturday #elated #joyous", which explicitly connotes the emotion joy.
However, the word Saturday is evidently not indicative of joy. Further, Paris could
be associated with emotions such as love. Therefore we propose a generative model
which assumes a mixture of two unigram language models to account for such word
mixtures in documents. More formally, our generative model describes the generation
of documents connoting emotion $e_t$ as follows:

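\[
P(D_{e_t}, Z \mid \theta_{e_t}) = \prod_{i=1}^{|D_{e_t}|} \prod_{w \in d_i} \left[ (1 - Z_w)\, \lambda_{e_t} P(w \mid \theta_{e_t}) + Z_w\, (1 - \lambda_{e_t})\, P(w \mid N) \right]^{c(w, d_i)} \tag{3}
\]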

where $\theta_{e_t}$ is the emotion language model and $N$ is the background
language model. $\lambda_{e_t}$ is the mixture parameter and $Z_w$ is a binary hidden
variable which indicates the language model that generated the word $w$.

The parameters $\theta_{e_t}$ and $Z$ can be estimated using expectation
maximization (EM), which iteratively maximizes the likelihood of the complete data
$(D_{e_t}, Z)$ by alternating between an E-step and an M-step. The E and M steps in
our case are as follows:
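E-step:

\[
P(Z_w = 0 \mid D_{e_t}, \theta^{(n)}_{e_t}) = \frac{\lambda_{e_t} P(w \mid \theta^{(n)}_{e_t})}{\lambda_{e_t} P(w \mid \theta^{(n)}_{e_t}) + (1 - \lambda_{e_t})\, P(w \mid N)} \tag{4}
\]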

M-step:

\[
P(w \mid \theta^{(n+1)}_{e_t}) = \frac{\sum_{i=1}^{|D_{e_t}|} P(Z_w = 0 \mid D_{e_t}, \theta^{(n)}_{e_t})\, c(w, d_i)}{\sum_{w \in V} \sum_{i=1}^{|D_{e_t}|} P(Z_w = 0 \mid D_{e_t}, \theta^{(n)}_{e_t})\, c(w, d_i)} \tag{5}
\]

where $n$ indicates the EM iteration number. The EM iterations are terminated when
an optimal estimate for the emotion language model $\theta_{e_t}$ is obtained. EM is
used to estimate the parameters of the $k$ mixture models corresponding to the
emotions in $E$. The emotion lexicon UMM-EmoLex is learnt by using the $k$ emotion
language models and the background model $N$ as follows:

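\[
\text{UMM-EmoLex}(w_i, \theta_{e_j}) = \frac{P(w_i \mid \theta^{(n)}_{e_j})}{\sum_{t=1}^{k} P(w_i \mid \theta^{(n)}_{e_t}) + P(w_i \mid N)} \tag{6}
\]

\[
\text{UMM-EmoLex}(w_i, N) = \frac{P(w_i \mid N)}{\sum_{t=1}^{k} P(w_i \mid \theta^{(n)}_{e_t}) + P(w_i \mid N)} \tag{7}
\]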

where $k$ is the number of emotions in the corpus, and UMM-EmoLex is a
$|V| \times (k+1)$ matrix.
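
The EM updates (Eqs. 4 and 5) and the lexicon construction (Eqs. 6 and 7) can be
sketched as below. This is an illustrative sketch under stated assumptions, not the
authors' implementation: the mixture weight is fixed at 0.5, the background model
$N$ is estimated from the pooled corpus counts with add-one smoothing, EM runs for a
fixed number of iterations instead of testing convergence, and all function names
are hypothetical. Because the E-step posterior depends only on the word, the sum
over documents in Eq. 5 reduces to the total count of each word in the emotion's
documents, so each emotion is represented by one count vector.

```python
import numpy as np

def fit_emotion_lm(counts, p_bg, lam=0.5, n_iter=50):
    """Estimate P(w | theta_e) for one emotion from its word counts via EM."""
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()                      # initial guess: relative frequency
    for _ in range(n_iter):
        # E-step (Eq. 4): posterior that w was generated by the emotion model.
        num = lam * p
        post = num / (num + (1.0 - lam) * p_bg)
        # M-step (Eq. 5): re-estimate the emotion model from the soft counts.
        weighted = post * counts
        p = weighted / weighted.sum()
    return p

def build_emolex(class_counts, lam=0.5, n_iter=50, alpha=1.0):
    """Return a |V| x (k+1) lexicon; the last column is the neutral valence."""
    class_counts = np.asarray(class_counts, dtype=float)   # shape (k, |V|)
    bg = class_counts.sum(axis=0) + alpha                   # assumed background model
    p_bg = bg / bg.sum()
    lms = [fit_emotion_lm(c, p_bg, lam, n_iter) for c in class_counts]
    cols = np.vstack(lms + [p_bg]).T                        # (|V|, k+1)
    return cols / cols.sum(axis=1, keepdims=True)           # Eqs. 6 and 7

# Hypothetical counts; vocabulary: ["happy", "sad", "saturday"]; rows: joy, sadness.
toy_counts = [[8, 0, 4],
              [0, 6, 4]]
lex = build_emolex(toy_counts)
```

In this toy run the word shared by both classes ("saturday") ends up with its
largest valence in the neutral column, which is the behaviour the background model
is meant to provide.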

5 Baseline Domain-specific Lexicon Generation Methods

In this section we formally present the other lexicon generation methods proposed in
the literature, using latent Dirichlet allocation and word-document frequency
statistics. These lexicon generation methods can be used to induce emotion and
sentiment lexicons from documents labelled for emotions and sentiments respectively.

5.1 Supervised Latent Dirichlet Allocation based Emotion
Lexicon

Latent Dirichlet Allocation (LDA) [5] is a popular topic detection algorithm which
models documents to exhibit characteristics of multiple topics. In sentiment analysis
LDA is applied to capture the relationships between words and sentiment (positiv-
ity, negativity) in addition to the topics [22, 19]. Similarly in emotion detection,
LDA has been applied in a semi-supervised manner using a minimal set of domain-
independent seed emotion words to learn emotion-relevant topics [39]. However su-
pervised LDA (sLDA) [21] offers a more accurate means to learn emotion-relevant
topics from labelled/weakly-labelled emotion corpora, because the usage of a min-
imal set of seed emotion words, does not guarantee the same level of coverage for
all domains, thereby affecting the accuracy of the topics generated. Accordingly
sLDA can be used to learn topic (emotion) distributions and map these into a
word-emotion lexicon. More formally, let $\theta_{e_1}, \theta_{e_2}, \ldots,
\theta_{e_n}$ be the topic distributions learnt for emotions $e_1, e_2, \ldots, e_n$;
then the emotion lexicon is induced as follows:

\[
\text{sLDA-EmoLex}(w_j, e_n) = \frac{P(w_j \mid \theta_{e_n})}{\sum_{i=1}^{|E|} P(w_j \mid \theta_{e_i})} \tag{8}
\]



where $\theta_{e_n}$ is the topic distribution for emotion $e_n$ obtained from sLDA
and $w_j$ is the $j$th word in the vocabulary $V$.
We learn the emotion-aware sentiment lexicons (sLDA-EmoSentiLex and sLDA-SentiLex)
using sLDA-EmoLex, following the same process as illustrated in sections 3.1 and
3.2.

5.2 Word-Document Frequency based Emotion Lexicon
Crowd-sourced emotion annotations provided by readers of documents (e.g.
news stories) can be used to learn a word-emotion lexicon. These emotion annotations
are in the form of numerical ratings, which can be normalized to define a
probability distribution of emotions on each document. The authors of [33] proposed
a lexicon generation method that combines the document-frequency distributions of
words and the emotion distributions over documents. Since this method involves
modelling the frequencies of words in emotional documents and emotion ratings, we
refer to the resulting lexicon as the word-document-frequency (WDF) lexicon. The
generation method for the lexicon can be formally described as follows:

\[
\text{WDF-EmoLex}(w_j, e_n) = \frac{\sum_{i=1}^{|X|} P(w_j \mid d_i)\, r_{in}}{\sum_{n=1}^{|E|} \sum_{i=1}^{|X|} P(w_j \mid d_i)\, r_{in}} \tag{9}
\]

where $w_j$ is the $j$th word in vocabulary $V$ and $r_{in}$ is the normalized
emotion rating of the $n$th emotion in $E$ on the $i$th document in the corpus $X$.
Observe that, unlike the sLDA and UMM lexicons, the WDF lexicon requires the emotion
labels for the documents in the form of numerical ratings. As in the case of the
sLDA-based emotion lexicon, we also learn emotion-aware sentiment lexicons
(WDF-EmoSentiLex and WDF-SentiLex) using WDF-EmoLex, following the process
illustrated in sections 3.1 and 3.2.
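
A compact NumPy sketch of Eq. 9 is given below. It is only an illustration: the
source does not state how $P(w_j \mid d_i)$ is estimated, so the sketch assumes the
relative frequency of the word in the document, and the function name and toy
matrices are hypothetical.

```python
import numpy as np

def wdf_emolex(doc_term_counts, ratings):
    """Word-emotion scores from word frequencies and per-document emotion ratings.

    doc_term_counts: (|X|, |V|) word counts per document.
    ratings:         (|X|, |E|) non-negative emotion ratings per document.
    """
    counts = np.asarray(doc_term_counts, dtype=float)
    ratings = np.asarray(ratings, dtype=float)
    # Assumed estimate of P(w_j | d_i): relative frequency within the document.
    p_w_given_d = counts / counts.sum(axis=1, keepdims=True)
    # Normalized ratings r_in: one emotion distribution per document.
    r = ratings / ratings.sum(axis=1, keepdims=True)
    scores = p_w_given_d.T @ r                      # numerator of Eq. 9, (|V|, |E|)
    return scores / scores.sum(axis=1, keepdims=True)

# Hypothetical data: two documents, three words, two emotions.
toy_doc_counts = [[2, 0, 1],
                  [0, 3, 1]]
toy_ratings = [[4, 1],
               [1, 4]]
wdf_lex = wdf_emolex(toy_doc_counts, toy_ratings)
```

With a corpus that carries a single discrete emotion label per document, each row
of `ratings` would be one-hot; that is an assumption about how such a corpus could
be encoded, not something the source specifies.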

6 Evaluation

Our evaluation is a comparative study of the performance of the standard sentiment
lexicons and the proposed emotion-corpus-based sentiment lexicons through
a variety of sentiment analysis tasks on benchmark Twitter data sets. Significance is
reported using a paired one-tailed t-test at the 95% confidence level (i.e. with
p-value ≤ 0.05). Observe that in all our experimental results, the best performing
methods are highlighted in bold.

6.1 Evaluation Tasks
Our evaluation includes the following sentiment analysis tasks.

1. Sentiment intensity prediction: Given a collection of words/phrases extracted
from sentiment bearing tweets, the objective is to predict a sentiment intensity
score for each word/phrase and arrange them in decreasing order of intensity.



The predictions are validated against a ranking given by humans. Formally, given
a phrase P, the sentiment intensity score for the phrase is calculated as follows:

\[
\text{SentimentIntensity}(P) = \sum_{w \in P} \log\left( \frac{\text{Lex}(w, +)}{\text{Lex}(w, -)} \right) \times \text{count}(w, P) \tag{10}
\]

where w is a word in the phrase P, count(w,P) is the number of times w appears in
P. Lex(w,+),Lex(w,−) are the positive and negative valences for the word w in a
lexicon. Some lexicons offer the sentiment intensity scores (e.g. SenticNet, S140
lexicon), in which case we use them directly. The aforementioned computation
applies to the UMM based lexicons like Sentilex and S140-UMM lexicon.

2. Sentiment classification: Given a collection of documents (tweets), the objective
is to classify them into positive and negative classes. The predictions are vali-
dated against human judgements. Formally, given a document d, the sentiment
class is predicted using a lexicon as follows:

\[
d[+] = \sum_{w \in d} \text{Lex}(w, +) \times \text{count}(w, d) \tag{11}
\]

where d[+] is the positive intensity of d. Similarly d[−] indicates the negative
intensity of d. Finally the sentiment class of d is determined as follows:

\[
\text{Sentiment}(d) =
\begin{cases}
\text{positive} & \text{if } d[+] > d[-] \\
\text{negative} & \text{if } d[-] > d[+]
\end{cases} \tag{12}
\]
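
Both scoring rules (Eq. 10 for intensity, Eqs. 11 and 12 for classification) can be
sketched with the standard library alone. The lexicon format (a dict from word to a
pair of positive and negative valences), the toy valences, the decision to skip
words missing from the lexicon and the `None` returned on ties are assumptions of
this sketch; the source does not specify how out-of-lexicon words or ties are
handled.

```python
import math
from collections import Counter

def sentiment_intensity(tokens, lex):
    """Eq. 10: count-weighted sum of log(positive / negative) valences."""
    total = 0.0
    for word, count in Counter(tokens).items():
        if word in lex:                      # assumption: ignore unknown words
            pos, neg = lex[word]
            total += math.log(pos / neg) * count
    return total

def classify(tokens, lex):
    """Eqs. 11 and 12: compare the positive and negative intensities of a document."""
    counts = Counter(tokens)
    d_pos = sum(lex[w][0] * c for w, c in counts.items() if w in lex)
    d_neg = sum(lex[w][1] * c for w, c in counts.items() if w in lex)
    if d_pos > d_neg:
        return "positive"
    if d_neg > d_pos:
        return "negative"
    return None                              # tie: left undefined by Eq. 12

# Hypothetical lexicon: word -> (positive valence, negative valence).
toy_lex = {"love": (0.7, 0.1), "hate": (0.1, 0.8), "fun!": (0.6, 0.2)}
intensity = sentiment_intensity(["love", "love", "hate"], toy_lex)
label = classify(["love", "love", "hate"], toy_lex)
```

Ranking a set of phrases for the intensity task then amounts to sorting them by
`sentiment_intensity` in decreasing order.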

6.2 Datasets
We use four benchmark data sets in our evaluation. Note that the emotion corpus
is used in two different ways to learn sentiment lexicons (refer sections 3 and 4).
Further the S140 training data is used to learn a sentiment lexicon using the pro-
posed method (refer section 4). The remaining data sets are used for evaluation. We
expect our evaluation to test the transferability of each of the lexicons, given that
the training and test data are not always from the same corpus, albeit from a
similar genre.

6.2.1 Emotion Dataset

A collection of 0.28 million emotional tweets crawled from the Twitter streaming
API (https://dev.twitter.com/streaming/public) using emotion hashtags provided in
[37]. The emotion labels in the data set correspond to Parrot’s [29] primary
emotions and were obtained through distant supervision
(http://www.gabormelli.com/RKB/Distant-Supervision-Learning-Algorithm). Parrot’s
emotion theory identifies an equal number of positive and negative emotions.
Therefore we expect the sentiment lexicons learnt on this corpus to be able to mine
both positive and negative sentiment in the test corpora.



6.2.2 S140 Dataset

A collection of 1.6 million (0.8 million positive and 0.8 million negative) sentiment
bearing tweets harnessed by Go et al. [13] using the Twitter API. Further the data set
also contains a collection of 359 (182 positive and 177 negative) manually annotated
tweets. We generate a sentiment lexicon using the proposed method in section 4 on
the 1.6 million tweets and compare it with the S140 lexicon [24].

6.2.3 SemEval-2013 Dataset

A collection of 3430 (2587 positive and 843 negative) tweets hand-labelled for sen-
timent using Amazon Mechanical Turk [27]. Note that unlike the S140 test data,
there is high skewness in the class distributions. Therefore it would be a greater
challenge to transfer the lexicons learnt on the emotion corpus and also those learnt
on the S140 training corpus to sentiment classification.

6.2.4 SemEval-2015 Dataset

A collection of 1315 words/phrases hand-labelled for sentiment intensity scores [34].
A higher score indicates greater positivity. Further the words/phrases are arranged
in decreasing order of positivity. We used this data set to validate the performance
of different lexicons in ranking words/phrases for sentiment.

6.3 Baselines and Metrics
The following different models are used in our comparative study:

1. Resource-based sentiment lexicons SentiWordNet and SenticNet;
2. Corpus-based sentiment lexicons S140 lexicon [24] and NRCHashtag lexicon [24];
3. Corpus-based sentiment lexicon (S140-UMM lexicon) learnt using the proposed
method on S140 corpus (refer section 4);
4. Corpus-based sentiment lexicons (WDF-EmoSentiLex, WDF-SentiLex,
sLDA-EmoSentiLex and sLDA-SentiLex) learnt on the emotion corpus (refer section
6.2.1) using the WDF and sLDA methods (refer section 5); and
5. Corpus-based sentiment lexicons (UMM-EmoSentiLex and UMM-SentiLex) learnt
on the emotion corpus (refer section 6.2.1) using the proposed method (refer
sections 3 and 4).

Performance evaluation is done using Spearman’s rank correlation coefficient
and F-score for sentiment ranking and sentiment classification respectively. F-score
is chosen for the classification task since it measures the performance of an algo-
rithm in terms of both precision and recall.
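
For reference, both metrics can be computed as sketched below. The sketch uses
average ranks for ties in the Spearman computation and reports the F-score (harmonic
mean of precision and recall) for one chosen class; the source does not say how ties
are treated or how per-class scores are averaged, so those choices, like the
function names, are assumptions.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0   # mean of ranks i+1 .. j+1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])

def f_score(gold, predicted, label):
    """F-score of one class from gold and predicted label sequences."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == label and p == label for g, p in pairs)
    fp = sum(g != label and p == label for g, p in pairs)
    fn = sum(g == label and p != label for g, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

rho = spearman([1, 2, 3, 4], [10, 20, 30, 40])
f_pos = f_score(["pos", "pos", "neg", "neg"], ["pos", "neg", "neg", "neg"], "pos")
```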



6.4 Results and Analysis
In this section we first analyse the quality of the different lexicon generation meth-
ods visually using word clouds, thereafter we analyse the sentiment ranking results
and the sentiment classification results obtained using the different lexicons.

6.5 Emotion word clouds for Lexicons
In this section we analyse the word-emotion associations learnt by the different
lexicons (WDF, sLDA and UMM) from the Twitter emotion corpus. This analysis
is particularly interesting, as we expect it to reveal trends that could affect
the performance on the sentiment analysis tasks, whose results are discussed later in
this section. Figures 3 and 4 show the most expressive words for the emotions fear
and joy (other emotions are not shown due to space constraints) identified by the
WDF, sLDA and UMM lexicons. It is evident from the figures that all these lexicons
capture the domain-specific vocabulary that is expressed informally. This is very
important for effective emotion detection in dynamic domains such as Twitter.

The word clouds presented in the figures are the top 100 words for each emotion,
after removing the common words in the English language. We observed that the WDF
lexicon is biased towards the majority class (joy here) in the corpus when learning
the word-emotion associations. This was established by observing class-level
performance metrics such as accuracy and F-score. For example, it identified words
connoting joy, such as succeed! and Ha!, as top anger words, and similarly for other
emotions. This is due to the fact that the WDF lexicon is designed for emotion-rated
documents and is less effective in capturing word-emotion associations on a corpus
that has discrete emotion labels. On the other hand, the sLDA lexicon learnt better
word-emotion associations than the WDF lexicon, because its underlying generative
model assumes that documents are a mixture of multiple topics (emotions). However,
the sLDA lexicon was not able to discriminate effectively between words that
strongly convey a particular emotion and those that are weakly associated with an
emotion. For example, words such as scared, worried and nervous are not well
distinguished from other words for the emotion fear, and similarly for other
emotions. As a result, the top words for each emotion have similar size in the word
clouds for the sLDA lexicon. This is not desirable, since the word-emotion
association scores form an important knowledge resource for sentiment analysis.

It was observed that the proposed UMM lexicon discriminates between strong
and weak words for each emotion effectively. This is very promising, since this
knowledge will be very useful for the sentiment intensity prediction task. Further,
UMM is also observed to capture words that are emotion-relevant but rare, for
example :) and fun! for the emotion joy. We expect this word-level
analysis to help infer useful insights about the performance gains of the proposed
lexicon over the baselines in different sentiment analysis tasks. In the following
section we analyse the performance of the different lexicons in sentiment analysis
tasks.



Fig. 3: Top fear words for WDF (top), sLDA (middle) and UMM (bottom) lexicons

6.5.1 Sentiment Ranking

Table 1 summarizes the sentiment ranking results obtained for different lexicons. In
general resource-based lexicons SentiWordNet and SenticNet are outperformed by
all the corpus-based lexicons. This is expected, because the vocabulary coverage of
these lexicons relevant to social media is limited compared to other lexicons. Fur-
thermore, the results also suggest that the sentiment intensity knowledge captured
by the corpus-based lexicons is superior to that of resource-based lexicons.

NRCHashtag lexicon performed significantly better than the remaining baselines
and the proposed UMM-EmoSentiLex. The significant performance differences between
NRCHashtag lexicon and S140 lexicon, and between NRCHashtag lexicon and S140-UMM
lexicon, clearly suggest the superiority of the NRCHashtag corpus over the S140
corpus in learning transferable lexicons for sentiment intensity prediction. It
would be interesting to compare the performance of these lexicons in the sentiment
classification tasks. In the case of emotion-aware sentiment lexicons learnt using
WDF and sLDA, we observed that WDF-SentiLex and sLDA-SentiLex outperformed their
counterparts WDF-EmoSentiLex and sLDA-EmoSentiLex. This is expected, since the
former model document-sentiment relationships. Further, the sLDA-based lexicons
performed better than the WDF ones, given that sLDA was better able to model
word-emotion associations from the documents, as illustrated in section 6.5.

Fig. 4: Top joy words for WDF (top), sLDA (middle) and UMM (bottom) lexicons

It is extremely promising to see that the proposed lexicons outperform most of
the baselines significantly. Amongst the proposed lexicons, UMM-SentiLex
performed significantly better than UMM-EmoSentiLex. This is not surprising, since
UMM-SentiLex is able to incorporate the sentiment-class knowledge of the
documents in the learning stage. This is consistent with the findings of earlier research
in supervised and unsupervised sentiment analysis.

6.5.2 Sentiment Classification

Sentiment classification results for the S140 data set are shown in Table 2. Here,
unlike in the sentiment intensity prediction task, SentiWordNet demonstrated
performance comparable to that of the corpus-based lexicons. However, SenticNet
performs the worst amongst all the emotion-agnostic lexicons. This suggests that
SentiWordNet transfers better to social media than SenticNet.
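
The classification tables report a positive F-score, a negative F-score and an overall F-score. The tabulated overall values are consistent, up to rounding, with the mean of the two class F-scores (for example, SentiWordNet on S140: (69.42 + 67.60) / 2 = 68.51). The sketch below computes these quantities for a toy set of predictions. The labels and predictions are hypothetical, and the averaging rule is inferred from the tables rather than stated in this section.

```python
from typing import Sequence


def f_score(gold: Sequence[str], pred: Sequence[str], label: str) -> float:
    """F1 score for one class, computed from true/false positives/negatives."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def overall_f_score(gold: Sequence[str], pred: Sequence[str]) -> float:
    """ASSUMPTION: overall F-score is the mean of the two class F-scores."""
    return (f_score(gold, pred, "positive")
            + f_score(gold, pred, "negative")) / 2


# Hypothetical gold labels and lexicon-based predictions.
gold_labels = ["positive", "positive", "positive",
               "negative", "negative", "negative"]
predictions = ["positive", "positive", "negative",
               "negative", "negative", "positive"]

positive_f = f_score(gold_labels, predictions, "positive")
negative_f = f_score(gold_labels, predictions, "negative")
overall_f = overall_f_score(gold_labels, predictions)

# Consistency check against one tabulated row (SentiWordNet, S140).
sentiwordnet_overall = (69.42 + 67.60) / 2
```

Averaging the class scores weights both classes equally, which matters on a skewed data set where one class dominates.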



Table 1: Sentiment Ranking Results

Method                  Spearman's Rank Correlation Coefficient

Emotion-agnostic Sentiment Lexicons (Baselines)
SentiWordNet            0.479
SenticNet               0.425
S140 lexicon            0.506
NRCHashtag lexicon      0.624
S140-UMM-lexicon        0.517

Emotion-aware Sentiment Lexicons (Baselines)
WDF-EmoSentiLex         0.489
WDF-SentiLex            0.497
sLDA-EmoSentiLex        0.493
sLDA-SentiLex           0.514

Emotion-aware Sentiment Lexicons (Proposed Methods)
UMM-EmoSentiLex         0.572
UMM-SentiLex            0.682

The S140 corpus-based lexicons significantly outperform the NRCHashtag lexicon,
given their advantage of being trained on a corpus that is similar to the test set. In the case
of the lexicons based on sLDA and WDF, we observed trends similar to those seen in the
sentiment intensity prediction task. Overall, sLDA-SentiLex performed better than the
other sLDA and WDF lexicons, given that it is able to model word-emotion
associations and document-sentiment relationships more effectively.

However, the proposed lexicon UMM-SentiLex recorded the best performance on
this data set. Once again the superiority of UMM-SentiLex over UMM-EmoSentiLex
is evident, given its ability to incorporate sentiment-class knowledge of the
documents in the learning stage. The performance improvements of the emotion
corpus-based sentiment lexicons over a majority of the baseline lexicons clearly suggest that
emotion knowledge, when exploited effectively, is very useful for sentiment analysis.

Table 3 summarizes the results for the different lexicons on the SemEval-2013 data
set. Unlike the previous data set, this one has a very skewed class distribution. The
impact of this is clearly reflected in the results. The majority of the lexicons recorded strong
performance in classifying positive-class documents. Once again SentiWordNet
demonstrated that it transfers better to social media than SenticNet.

As on the previous data set, the S140 corpus-based lexicons performed better
than the NRCHashtag corpus-based lexicon. Overall comparison across the evaluation
tasks suggests that the S140 corpus-based lexicons record better performance in
sentiment classification, whereas the NRCHashtag lexicon records better performance in
sentiment quantification. This offers interesting directions for future work on
composing different corpora for learning sentiment lexicons.

Table 2: Sentiment Classification Results on S140 test data set

Method                  Positive F-score   Negative F-score   Overall F-score

Emotion-agnostic Sentiment Lexicons (Baselines)
SentiWordNet            69.42              67.60              68.51
SenticNet               59.88              59.84              59.86
S140-lexicon            71.55              69.42              70.48
NRCHashtag-lexicon      66.66              64.75              65.70
S140-UMM-lexicon        75.14              69.36              72.25

Emotion-aware Sentiment Lexicons (Baselines)
WDF-EmoSentiLex         58.57              48.24              53.40
WDF-SentiLex            59.63              50.12              54.87
sLDA-EmoSentiLex        61.68              57.23              59.45
sLDA-SentiLex           62.93              58.11              60.52

Emotion-aware Sentiment Lexicons (Proposed Methods)
UMM-EmoSentiLex         67.51              71.14              69.32
UMM-SentiLex            72.93              74.11              73.52

In the case of the sLDA- and WDF-based lexicons, we observed that they perform
poorly compared to the emotion-agnostic baseline lexicons. This could be due
to the sensitivity of these methods to class distributions while learning lexicons.
In general, we observed that these two lexicon generation methods were not
able to leverage the emotion corpus effectively to learn sentiment polarity lexicons,
which suggests the usefulness of the proposed UMM method, whose UMM-SentiLex
variant has consistently recorded the best performance in all the sentiment analysis
tasks. We highlight our findings for the UMM lexicons on the SemEval-2013 data set below.

The proposed lexicon UMM-EmoSentiLex performed significantly below most
of the lexicons on this data set. We believe that the inability to learn the
document-sentiment relationships, coupled with the skewed class distribution
of the data set, resulted in this performance degradation. However, our
proposed lexicon UMM-SentiLex significantly outperformed all the remaining
lexicons. The consistent performance of UMM-SentiLex in all the evaluation tasks
provides strong evidence of the correlation between emotions and sentiments. We believe
that the emotion-sentiment mapping in psychology effectively clusters the emotion
corpus into sentiment classes, and that the ability of the UMM model to effectively
capture the word-sentiment relationships then resulted in the performance improvements
for UMM-SentiLex.
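
The clustering step described above can be sketched as a simple relabelling of the emotion corpus. The emotion-to-sentiment dictionary below is an illustrative assumption, not the mapping adopted in this paper (which is defined in an earlier section). The toy tweets and the function name `cluster_by_sentiment` are likewise hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# ASSUMPTION: illustrative mapping only; the mapping used in the paper is
# defined elsewhere and may differ (e.g. in how it treats surprise).
EMOTION_TO_SENTIMENT: Dict[str, str] = {
    "joy": "positive",
    "love": "positive",
    "anger": "negative",
    "fear": "negative",
    "sadness": "negative",
}


def cluster_by_sentiment(
        corpus: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (text, emotion) pairs into sentiment classes.

    Documents whose emotion has no entry in the mapping are skipped.
    """
    clusters: Dict[str, List[str]] = defaultdict(list)
    for text, emotion in corpus:
        sentiment = EMOTION_TO_SENTIMENT.get(emotion)
        if sentiment is not None:
            clusters[sentiment].append(text)
    return dict(clusters)


# Hypothetical emotion-labelled tweets.
toy_corpus = [
    ("what a fun day :)", "joy"),
    ("so scared right now", "fear"),
    ("miss you already", "sadness"),
    ("did not see that coming", "surprise"),
]

clusters = cluster_by_sentiment(toy_corpus)
```

Each document requires only a dictionary lookup, so the relabelling is linear in the corpus size. A sentiment lexicon can then be learnt from the resulting positive and negative clusters.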

Table 3: Sentiment Classification Results on SemEval-2013 data set

Method                  Positive F-score   Negative F-score   Overall F-score

Emotion-agnostic Sentiment Lexicons (Baselines)
SentiWordNet            80.14              50.38              65.26
SenticNet               54.95              55.94              55.45
S140-lexicon            80.13              57.87              69.00
NRCHashtag-lexicon      80.25              53.98              67.11
S140-UMM-lexicon        78.87              55.85              67.36

Emotion-aware Sentiment Lexicons (Baselines)
WDF-EmoSentiLex         59.83              47.39              53.61
WDF-SentiLex            62.18              59.88              61.03
sLDA-EmoSentiLex        67.84              58.99              63.41
sLDA-SentiLex           73.45              60.02              66.73

Emotion-aware Sentiment Lexicons (Proposed Methods)
UMM-EmoSentiLex         64.51              48.37              56.44
UMM-SentiLex            83.06              60.98              72.02

7 Conclusions

In this paper we study the mapping proposed in psychology between emotions and
sentiments from a computational modelling perspective, in order to establish the
role of an emotion corpus for sentiment analysis. By combining a generative
unigram mixture model (UMM) with the emotion-sentiment mapping, we propose two
different methods to extract lexicons for Twitter sentiment analysis from an
emotion-labelled Twitter corpus. Further, we evaluate how the proposed UMM
lexicon generation method fares in comparison with other automatic lexicon extraction
methods, based on supervised latent Dirichlet allocation (sLDA) and
word-document frequency statistics (WDF), in learning emotion-aware sentiment polarity
lexicons. We comparatively evaluate the quality of the proposed emotion-aware
sentiment lexicons, those generated using sLDA and WDF, and standard sentiment lexicons
that are agnostic to emotion knowledge, through a variety of sentiment analysis tasks
on benchmark Twitter data sets. Our experiments confirm that the proposed
sentiment lexicons yield significant improvements over standard lexicons in sentiment
classification and sentiment intensity prediction tasks. It is extremely promising to
see the potential of an emotion corpus as a useful knowledge resource for sentiment
analysis, especially on social media, where emotions and sentiments are widely
expressed. Further, the cost-effectiveness of the emotion-sentiment mapping in clustering
the emotion corpus into positive and negative classes (0.28 million tweets in a second)
makes it practical to adopt large emotion corpora in order to extract
sentiment lexicons with improved coverage.

References

1. Bandhakavi, A., Wiratunga, N., Deepak, P., Massie, S.: Generating a word-emotion lexicon
from #emotional tweets. In: Proc of the 3rd Joint Conference on Lexical and Computational
Semantics (*SEM 2014) (2014)

2. Bandhakavi, A., Wiratunga, N., Massie, S., Deepak, P.: Lexicon generation for emotion detec-
tion from text. IEEE Intelligent Systems, January/February (2017)

3. Binali, H., Potdar, V.: Emotion detection state-of -the-art. In: Proc of the CUBE International
Information Technology Conference, pp. 501–507 (2012)

4. Binali H. Potdar, V., Wu, C.: Computational approaches for emotion detection in text. In: 4th
IEEE International Conference on Digital Ecosystems and Technologies DEST (2010)


A.Bandhakavi, N.Wiratunga, S.Massie, and P.Deepak

5. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent dirichlet allocation. the Journal of machine Learning
research 3, 993–1022 (2003)

6. Boyd, D., Golder, S., Lotan, G.: Tweet, tweet, retweet: Conversational aspects of retweeting
on twitter. In: Proc of the 43rd Hawaii International Conference on System Sciences. (2010)

7. Cambria, E., Olsher, D., Rajagopal, D.: Senticnet 3: A common and common-sense knowledge
base for cognition-driven sentiment analysis. In: 28th AAAI conf on Artificial Intelligence,
pp. 1515-1521 (2014)

8. Ekman, P.: An argument for basic emotions. Cognition and Emotion, 6(3), pp. 169-200 (1992)
9. Esuli, A., Baccianella, S., Sebastiani, F.: Sentiwordnet 3.0: An enhanced lexical resource for

sentiment analysis and opinion mining. In: Proc of LREC (2010)
10. Fellbaum, Christiane: Wordnet and wordnets. In: Encyclopedia of Language and Linguistics

pp. 665–670 (2005)
11. Feng, S., K.Song, D.Wang, G.Yu: A word-emotion mutual reinformcement ranking model

for building sentiment lexicon from massive collection of microblogs. World Wide Web,
18(4):949-967 (2015)

12. Ghazi, D., Inkpen, D., Szpakowicz, S.: Hierarchical approach to emotion recognition and clas-
sification in texts. In: Proc of the 23rd Canadian conference on Advances in Artificial Intelli-
gence (2010)

13. Go, A., Bhayani, R., Huang, L.: Twitter sentiment classification using distant supervision.
Processing, pp 1-6 (2009)

14. Hogenboom, A., Bal, D., Frasincar, F., Bal, M.: Exploiting emoticons in polarity classification
of text. Journal of Web Engineering (2013)

15. Hu, M., Liu., B.: Mining and summarizing customer reviews. In: Proc of the ACM SIGKDD
International Conference on Knowledge Discovery & Data Mining (2004)

16. Hu, X., Tang, J., Gao, H., Liu, H.: Unsupervised sentiment analysis with emotional signals.
In: Proc of the International World Wide Web Conference (WWW) (2013)

17. Jiang, F., Liu, Y.Q., Luan, H.B., Sun, J.S., Zhu, X., Zhang, M., Ma, S.P.: Microblog sentiment
analysis with emoticon space model. Journal of Computer Science and Technology, vol 30(5),
pp 1120-1129 (2015)

18. Jin, X., Wang, Z.: An emotion space model for recognition of emotions in spoken chinese. In:
Proc of the First international conference on Affective Computing and Intelligent Interaction
(2005)

19. Lin, C., He, Y.: Joint sentiment/topic model for sentiment analysis. In: Proceedings of the 18th
ACM conference on Information and knowledge management, pp. 375–384. ACM (2009)

20. Liu, H., Singh., P.: Conceptnet- a practical commonsense reasoning tool-kit. BT Technology
Journal, 22(4), pp. 211-226 (2004)

21. Mcauliffe, J.D., Blei, D.M.: Supervised topic models. In: Advances in Neural Information
Processing Systems, pp. 121-128 (2007)

22. Mei, Q., Ling, X., Wondra, M., Su, H., Zhai, C.: Topic sentiment mixture: modeling facets
and opinions in weblogs. In: Proceedings of the 16th international conference on World Wide
Web, pp. 171–180. ACM (2007)

23. Mohammad, S.M.: #emotional tweets. In: Proc of The First Joint Conference on Lexical and
Computational Semantics, pp. 246-255 (2012)

24. Mohammad, S.M., Kiritchenko, S., Zhu, X.: Nrc-canada: Building the state-of-the-art in sen-
timent analysis of tweets. In: 7th International Workshop on Semantic Evaluation (SemEval
2013), pp 321-327 (2013)

25. Mohammad, S.M., Turney, P.: Crowdsourcing a word-emotion association lexicon. Computa-
tional Intelligence, 29(3), pp. 436-465 (2013)

26. Munezero, M., Montero, C.S., Sutinen, E., Pajunen, J.: Are they different? affect, feeling,
emotion,sentiment, and opinion detection in text. IEEE Transactions on Affective Computing,
Vol 5 No 2 (2014)

27. Nakov, P., Rosenthal, S., Kozareva, Z., Stoyanov, V., Ritter, A., Wilson, T.: Semeval-2013
task2: Sentiment analysis in twitter. In: Proc of the 7th International Workshop on Semantic
Evaluation (SemEval-2013) (2013)


Emotion-Aware Polarity Lexicons for Twitter Sentiment Analysis

28. Pang, B., Lee, L.: Opinion mining and sentiment analysis. Foundations and Trends in Infor-
mation Retrieval 2(1), 1–135 (2008)

29. Parrott, W.: Emotions in social psychology. Psychology Press, Philadelphia (2001)
30. Plutchik., R.: A general psychoevolutionary theory of emotion. In R. Plutchik & H. Kellerman

(Eds.), Emotion: Theory, research, and experience: Vol. 1., (pp. 3–33) (1980)
31. Poria, S., Gelbukh, A., Cambria, E., Hussain, A., Huang, G.B.: Emosenticspace: A novel

framework for affective common-sense reasoning. Knowledge-Based Systems 69, pp. 108-
123 (2014)

32. Qadir, A., Riloff, E.: Bootstrapped learning of emotion hashtags #hashtags4you. In: the 4th
Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis
(WASSA 2013) (2013)

33. Rao, Y., Lei, J., Wenyin, L., Li, Q., Chen, M.: Building emotional dictionary for sentiment
analysis of online news. World Wide Web, Vol 17, pp. 723-742 (2014)

34. Rosenthal, S., Nakov, P., Kiritchenko, S., Mohammad, S.M., Ritter, A., Stoyanov, V.: Semeval-
2015: Sentiment analysis in twitter. In: Proc of the 9th International Workshop on Semantic
Evaluation (SemEval-2015) (2015)

35. Song, K., Feng, S., Gao, W., Wang, D., Chen, L., Zhang, C.: Build emotion lexicon from
microblogs by combining effects of seed words and emoticons in a hetereogeneous graph. In:
Proc of the 26th ACM Conference on Hypertext & Social Media, pp. 283-292 (2015)

36. Stone, P.J., Dexter, D.C., Marshall, S.S., Daniel, O.M.: The general inquirer: A computer
approach to content analysis. The MIT Press (1966)

37. Wang, W.: Harnessing twitter ”big data” for automatic emotion identification. In: Proc of the
ASE/IEEE International Conference on Social Computing and International Conference on
Privacy, Security, Risk and Trust (2012)

38. Wilson, T., Wiebe, J., Hoffmann., P.: Recognizing contextual polarity in phrase-level sentiment
analysis. In: Proc. of HLT-EMNLP-2005 (2005)

39. Yang, M., Peng, B., Chen, Z., Dingju Zhu, a.K.C.: A topic model for building fine-grained
domain-specific emotion lexicon. In: Proc of the 52nd Annual Meeting of the Assoc for
Computational Linguistics, pp. 421-426 (2014)



	OA: GREEN
	OA Logo: 
	AUTHORS: BANDHAKAVI, A., WIRATUNGA, N., MASSIE, S. and DEEPAK, P.
	TITLE: Emotion-aware polarity lexicons for Twitter sentiment analysis.
	YEAR: 2018
	Publisher citation: BANDHAKAVI, A., WIRATUNGA, N., MASSIE, S. and DEEPAK, P. 2018. Emotion-aware polarity lexicons for Twitter sentiment analysis. Expert systems [online], Early View. Available from: https://doi.org/10.1111/exsy.12332
	OpenAIR citation: BANDHAKAVI, A., WIRATUNGA, N., MASSIE, S. and DEEPAK, P. 2018. Emotion-aware polarity lexicons for Twitter sentiment analysis. Expert systems, Early View. Held on OpenAIR [online]. Available from: https://openair.rgu.ac.uk
	Version: AUTHOR ACCEPTED
	Publisher: WILEY
	Series: Expert systems
	ISSN: 0266-4720
	eISSN: 1468-0394
	Set statement: This is the peer reviewed version of the following article: BANDHAKAVI, A., WIRATUNGA, N., MASSIE, S. and DEEPAK, P. 2018. Emotion-aware polarity lexicons for Twitter sentiment analysis. Expert systems [online], Early View, which has been published in final form at https://doi.org/10.1111/exsy.12332. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions.
	License: BY-NC 4.0
	License URL: https://creativecommons.org/licenses/by-nc/4.0
	CC Logo: 
		2018-10-25T11:08:40+0100
	OpenAIR at RGU