
Table 2 Overview of the sentence-level methods available in the literature

From: SentiBench - a benchmark comparison of state-of-the-practice sentiment analysis methods

| Name | Output | Validation | Compared to | Lexicon size |
| --- | --- | --- | --- | --- |
| Emoticons | , | - | - | 79 |
| Opinion Lexicon | Provides polarities for lexicons | Product Reviews from Amazon and CNet | - | 6,787 |
| Opinion Finder (MPQA) | , Objective, | MPQA [45] | Compared to itself in different versions | 20,611 |
| SentiWordNet | Provides positive, negative and objective scores for each word (0.0 to 1.0) | - | General Inquirer (GI) [46] | 117,658 |
| Sentiment140 | , 2, | Their own datasets - 359 tweets (Tweets_STF, presented at Table 3) | Naive Bayes, Maximum Entropy, and SVM classifiers as described in [6] | - |
| LIWC15 | , | - | Their previous dictionary (2001) | 4,500 |
| SenticNet | , | Patient Opinions (Unavailable) | SentiStrength [11] | 15,000 |
| AFINN | Provides polarity score for lexicons (−5 to 5) | Twitter [47] | OpinionFinder [22], ANEW [30], GI [46] and SentiStrength [11] | 2,477 |
| SO-CAL | , 0, | Epinion [48], MPQA [45], Myspace [11] | MPQA [45], GI [46], SentiWordNet [24], ‘Maryland’ Dict [49], Google Generated Dict [50] | 9,928 |
| Emoticons DS (Distant Supervision) | Provides polarity score for lexicons | Validation with unlabeled Twitter data [51] | - | 1,162,894 |
| NRC Hashtag | Provides polarities for lexicons | Twitter (SemEval-2007 Affective Text Corpus) [52] | WordNet Affect [52] | 679,468 |
| Pattern.en | Objective, , | Product Reviews, but the source was not specified | - | 2,973 |
| SASA [35] | , Neutral, Unsure, | ‘Political’ tweets labeled by ‘turkers’ (AMT) (unavailable) | - | - |
| PANAS-t | Provides association for each word with eleven moods (joviality, attentiveness, fear, etc.) | Validation with unlabeled Twitter data [51] | - | 50 |
| Emolex | Provides polarities for lexicons | - | Compared with existing gold standard data but it was not specified | 141,820 |
| USent | , neu, | Their own dataset - TED talks | Comparison with other multimedia recommendation approaches | MPQA (8,226)/Their own (9,176) |
| Sentiment140 Lexicon | Provides polarity scores for lexicon | Twitter and SMS from SemEval 2013, task 2 [53] | Other SemEval 2013, task 2 approaches | 1,220,176 |
| SentiStrength | , 0, | Their own datasets - Twitter, Youtube, Digg, Myspace, BBC Forums and Runners World | The best of nine Machine Learning techniques for each test | 2,698 |
| Stanford Recursive Deep Model | , , neutral, , | Movie Reviews [54] | Naive Bayes and SVM with bag of words features and bag of bigram features | 227,009 |
| Umigon | , Neutral, | Twitter and SMS from SemEval 2013, task 2 [53] | [40] | 1,053 |
| ANEW_WKB | Provides ratings for words in terms of Valence, Arousal and Dominance. Results can also be grouped by gender, age and education | - | Compared to similar works, including cross-language studies, by means of correlations between emotional dimensions | 13,915 |
| VADER | , (−0.05,…,0.05), | Their own datasets - Twitter, Movie Reviews, Technical Product Reviews, NYT User’s Opinions | GI [46], LIWC [7], SentiWordNet [24], ANEW [30], SenticNet [55] and some Machine Learning approaches | 7,517 |
| LIWC15 | , | - | Their previous dictionary (2007) | 6,400 |
| Semantria | , neutral, | Not available | Not available | Not available |