SentiBench - a benchmark comparison of state-of-the-practice sentiment analysis methods

Ribeiro, Filipe N; Araújo, Matheus; Gonçalves, Pollyanna; André Gonçalves, Marcos; Benevenuto, Fabrício

doi:10.1140/epjds/s13688-016-0085-1

EPJ Data Science

Table 1 Overview of the sentence-level methods available in the literature


Name	Description	Lexicon (L)	Machine learning (ML)
Emoticons [20]	Messages containing positive/negative emoticons are positive/negative. Messages without emoticons are not classified.	✓
Opinion Lexicon [2]	Focuses on product reviews. Builds a lexicon to predict the polarity of product-feature phrases, which are summarized to provide an overall score for that product feature.	✓
Opinion Finder (MPQA) [22, 23]	Performs subjectivity analysis through a framework that applies lexical analysis first and a machine learning approach afterwards.	✓	✓
SentiWordNet [24, 25]	Construction of a lexical resource for Opinion Mining based on WordNet [26]. The authors grouped adjectives, nouns, etc. in synonym sets (synsets) and associated three polarity scores (positive, negative and neutral) with each one.	✓	✓
LIWC [7]	An acronym for Linguistic Inquiry and Word Count, LIWC is a paid text analysis tool to evaluate emotional, cognitive, and structural components of a given text. It uses a dictionary with words classified into categories (anxiety, health, leisure, etc.). An updated version was launched in 2015.	✓
Sentiment140 [27]	Sentiment140 (previously known as ‘Twitter Sentiment’) was proposed as an ensemble of three classifiers (Naive Bayes, Maximum Entropy, and SVM) built with a huge amount of tweets containing emoticons collected by the authors. It has been improved and transformed into a paid tool.		✓
SenticNet [28]	Uses dimensionality reduction to infer the polarity of common sense concepts and hence provide a resource for mining opinions from text at a semantic, rather than just syntactic level.	✓
AFINN [29] - a new ANEW	Builds a Twitter-based sentiment lexicon including Internet slang and obscene words. AFINN can be considered an expansion of ANEW [30], a dictionary created to provide emotional ratings for English words. The ANEW dictionary rates words in terms of pleasure, arousal and dominance.	✓
SO-CAL [31]	Creates a new lexicon with unigrams (verbs, adverbs, nouns and adjectives) and multi-grams (phrasal verbs and intensifiers) hand-ranked on a scale from +5 (strongly positive) to −5 (strongly negative). The authors also included part-of-speech processing, negation and intensifiers.	✓
Emoticons DS (Distant Supervision) [32]	Creates a scored lexicon based on a large dataset of tweets. It is based on the frequency with which each lexicon entry occurs with positive or negative emotions.	✓
NRC Hashtag [33]	Builds a lexicon dictionary using a distant supervision approach. In a nutshell, it uses known hashtags (e.g. #joy, #happy) to ‘classify’ the tweet. Afterwards, it measures how frequently each specific n-gram occurs with an emotion and calculates its Strength of Association with that emotion.	✓
Pattern.en [34]	Python programming package (toolkit) for NLP, Web Mining and Sentiment Analysis. Sentiment analysis is provided by averaging the scores of the adjectives in the sentence according to a bundled lexicon of adjectives.	✓
SASA [35]	Detects public sentiment on Twitter during the 2012 U.S. presidential election. It is based on a statistical model obtained from a Naive Bayes classifier on unigram features. It also explores emoticons and exclamations.		✓
PANAS-t [8]	Detects mood fluctuations of users on Twitter. The method is an adapted version of the Positive Affect Negative Affect Scale (PANAS) [36], a well-known method in psychology with a large set of words, each associated with one of eleven moods such as surprise, fear and guilt.	✓
Emolex [37]	Builds a general sentiment lexicon supported by crowdsourcing. Each entry lists the association of a token with 8 basic sentiments (joy, sadness, anger, etc.) defined by [38]. The proposed lexicon includes unigrams and bigrams from the Macquarie Thesaurus and also words from GI and WordNet.	✓
USent [39]	Infers additional user ratings for reviews by performing sentiment analysis (SA) of user comments and integrating its output in a nearest neighbor (NN) model that provides multimedia recommendations over TED talks.	✓	✓
Sentiment140 Lexicon [40]	A lexicon dictionary based on the same dataset used to train the Sentiment140 method. The lexicon was built in a similar way to [33], but the authors used the occurrence of emoticons to classify each tweet as positive or negative. The n-gram score was then calculated from the frequency of occurrence in each class of tweets.	✓
SentiStrength [11]	Builds a lexicon dictionary annotated by humans and improved with the use of Machine Learning.	✓	✓
Stanford Recursive Deep Model [41]	Proposes a model called Recursive Neural Tensor Network (RNTN) that processes all sentences, dealing with their structures, and computes the interactions between them. This approach is interesting since the RNTN takes into account the order of words in a sentence, which is ignored by most methods.	✓	✓
Umigon [18]	Disambiguates tweets using a lexicon with heuristics to detect negations, plus evaluation of elongated words and hashtags.	✓
ANEW_SUB [42]	Another extension of the ANEW dictionary [30], including the most common words from the SubtlexUS corpus [43]. SubtlexUS was an effort to propose a different way of calculating word frequencies, based on film and TV subtitles.	✓
VADER [15]	It is a human-validated sentiment analysis method developed for Twitter and social media contexts. VADER was created from a generalizable, valence-based, human-curated gold standard sentiment lexicon.	✓
Semantria [44]	It is a paid tool that employs multi-level analysis of sentences. Basically it has four levels: part of speech, assignment of previous scores from dictionaries, application of intensifiers, and finally machine learning techniques to deliver a final weight for the sentence.	✓	✓
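
To make the lexicon-only (L) descriptions above more concrete, the following Python sketch illustrates two of the simplest ideas in the table: the Emoticons rule (label a message from its emoticons, leave it unclassified otherwise) and a frequency-based scored lexicon in the spirit of the distant-supervision rows. It is a minimal illustration, not the implementation from [20] or [32]. The emoticon sets, the function names, the whitespace tokenization, the handling of conflicting emoticons, and the "fraction of positive occurrences" score are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical, deliberately tiny emoticon sets (assumption: the table does
# not list the emoticons used by the original methods).
POSITIVE_EMOTICONS = {":)", ":-)", ":D", "=)"}
NEGATIVE_EMOTICONS = {":(", ":-(", ":'(", "=("}
ALL_EMOTICONS = POSITIVE_EMOTICONS | NEGATIVE_EMOTICONS


def emoticon_label(message):
    """Return 'positive', 'negative', or None (unclassified) from emoticons alone."""
    tokens = message.split()
    has_pos = any(t in POSITIVE_EMOTICONS for t in tokens)
    has_neg = any(t in NEGATIVE_EMOTICONS for t in tokens)
    if has_pos and not has_neg:
        return "positive"
    if has_neg and not has_pos:
        return "negative"
    # No emoticon at all, or conflicting emoticons (the latter is an assumption).
    return None


def build_scored_lexicon(messages):
    """Score each word by the fraction of its labelled messages that are positive.

    Messages are labelled with emoticon_label(); unlabelled messages are skipped.
    The score formula is an assumption chosen for illustration.
    """
    pos_counts, neg_counts = Counter(), Counter()
    for message in messages:
        label = emoticon_label(message)
        if label is None:
            continue
        # Count each distinct non-emoticon word once per message.
        words = {t.lower() for t in message.split() if t not in ALL_EMOTICONS}
        target = pos_counts if label == "positive" else neg_counts
        target.update(words)

    lexicon = {}
    for word in set(pos_counts) | set(neg_counts):
        total = pos_counts[word] + neg_counts[word]
        lexicon[word] = pos_counts[word] / total
    return lexicon


if __name__ == "__main__":
    sample = ["great day :)", "great movie :D", "awful day :(", "no emoticon here"]
    for word, score in sorted(build_scored_lexicon(sample).items()):
        print(word, score)
```

In this sketch a score near 1 means the word co-occurred mostly with positive emoticons and a score near 0 mostly with negative ones. Words that appear only in messages without emoticons never enter the lexicon, which mirrors the coverage limitation noted for the Emoticons method.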
