
Table 2 List of 487 features extracted by our framework

From: Early detection of promoted campaigns on social media

| Class | Feature description | No. of features |
|---|---|---|
| Network (†) | Number of nodes | 1 |
| | Number of edges | 1 |
| | (*) Strength distribution | 8 |
| | (*) In-strength distribution | 8 |
| | (*) Out-strength distribution | 8 |
| | (*) Distribution of number of nodes in the connected components | 8 |
| | Network density of whole and largest connected component | 2 |
| | Network assortativity of whole and largest connected component | 2 |
| | Mean shortest path length of the largest connected component | 1 |
| User | (*) Sender’s follower count | 8 |
| | (*) Sender’s followee count | 8 |
| | (*) Sender’s number of favorite tweets | 8 |
| | (*) Sender’s number of Twitter statuses posted | 8 |
| | (*) Sender’s number of lists subscribed to | 8 |
| | (*) Originator’s follower count | 8 |
| | (*) Originator’s followee count | 8 |
| | (*) Originator’s number of favorite tweets | 8 |
| | (*) Originator’s number of Twitter statuses posted | 8 |
| | (*) Originator’s number of lists subscribed to | 8 |
| Timing | Number of tweets appearing in a given window | 1 |
| | (*) Time between two consecutive tweets | 8 |
| | (*) Time between two consecutive retweets | 8 |
| | (*) Time between two consecutive mentions | 8 |
| Content | (*) Number of hashtags in a tweet | 8 |
| | (*) Number of mentions in a tweet | 8 |
| | (*) Number of URLs in a tweet | 8 |
| | (*,**) Frequency of POS tags in a tweet | 64 |
| | (*,**) Proportion of POS tags in a tweet | 64 |
| | (*) Number of words in a tweet | 8 |
| | (*) Entropy of words in a tweet | 8 |
| Sentiment | (***) Happiness scores of aggregated tweets | 2 |
| | (***) Valence scores of aggregated tweets | 2 |
| | (***) Arousal scores of aggregated tweets | 2 |
| | (***) Dominance scores of aggregated tweets | 2 |
| | (*) Happiness score of single tweets | 8 |
| | (*) Valence score of single tweets | 8 |
| | (*) Arousal score of single tweets | 8 |
| | (*) Dominance score of single tweets | 8 |
| | (*) Polarization score of single tweets | 8 |
| | (*) Entropy of polarization scores of single tweets | 8 |
| | (*) Positive emoticons entropy of single tweets | 8 |
| | (*) Negative emoticons entropy of single tweets | 8 |
| | (*) Emoticons entropy of single tweets | 8 |
| | (*) Ratio between positive and negative scores of single tweets | 8 |
| | (*) Number of positive emoticons in single tweets | 8 |
| | (*) Number of negative emoticons in single tweets | 8 |
| | (*) Total number of emoticons in single tweets | 8 |
| | Ratio of tweets that contain emoticons | 1 |
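
The per-row counts can be checked against the 487 in the table title. The sketch below is only a consistency check, not part of the authors' framework. It assumes that every Network row is computed once for each of the three networks named in the table note (†), that the retweet and mention networks are directed, and that the in-strength and out-strength rows are dropped for the undirected hashtag co-occurrence network. The last two points are our reading of the note; the source only states that the hashtag co-occurrence network is undirected. All variable names are illustrative.

```python
# Per-row feature counts copied from Table 2, grouped by class.
NETWORK_ROWS = {
    "nodes": 1,
    "edges": 1,
    "strength": 8,
    "in_strength": 8,
    "out_strength": 8,
    "component_sizes": 8,
    "density": 2,
    "assortativity": 2,
    "mean_shortest_path": 1,
}
USER = [8] * 10                       # 5 sender rows + 5 originator rows
TIMING = [1, 8, 8, 8]
CONTENT = [8, 8, 8, 64, 64, 8, 8]     # 64 = 8 POS tags x 8 statistics
SENTIMENT = [2] * 4 + [8] * 13 + [1]

# Assumption: retweet and mention networks are directed and use every row.
directed = sum(NETWORK_ROWS.values())
# Assumption: an undirected network has no separate in/out-strength rows.
undirected = (
    directed - NETWORK_ROWS["in_strength"] - NETWORK_ROWS["out_strength"]
)
network = 2 * directed + undirected

total = network + sum(USER) + sum(TIMING) + sum(CONTENT) + sum(SENTIMENT)
print(network, total)  # prints "101 487"
```

Under these assumptions the five classes contribute 101, 80, 25, 168, and 113 features respectively, which matches the stated total.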

(†) We consider three types of network: retweet, mention, and hashtag co-occurrence networks. The hashtag co-occurrence network is undirected.

(*) Distribution types. For each distribution, the following eight statistics are computed and used as individual features: min, max, median, mean, standard deviation, skewness, kurtosis, and entropy.

(**) Part-of-speech (POS) tags. There are eight POS tags: verbs, nouns, adjectives, modal auxiliaries, pre-determiners, interjections, adverbs, and pronouns.

(***) For each feature, we compute the mean and standard deviation.
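
To make the distribution-type features concrete, here is a minimal sketch of turning one list of raw values (for example, follower counts) into the eight statistics named in the table note. The source does not specify the estimators, so several choices here are assumptions: population rather than sample standard deviation, skewness and kurtosis as standardized third and fourth moments (kurtosis not reduced by 3), and entropy as the Shannon entropy in bits of the empirical frequencies of distinct values. The function name `distribution_features` is hypothetical.

```python
import math
import statistics
from collections import Counter


def distribution_features(values):
    """Summarize a non-empty list of numbers with eight statistics."""
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std (assumption)

    def standardized_moment(k):
        # k-th central moment divided by std**k; 0.0 when all values are equal.
        if std == 0:
            return 0.0
        return sum((v - mean) ** k for v in values) / n / std ** k

    # Shannon entropy (bits) over the frequencies of distinct values
    # (assumption: the source does not say how entropy is estimated).
    counts = Counter(values)
    entropy = sum((c / n) * math.log2(n / c) for c in counts.values())

    return {
        "min": min(values),
        "max": max(values),
        "median": statistics.median(values),
        "mean": mean,
        "std": std,
        "skewness": standardized_moment(3),
        "kurtosis": standardized_moment(4),  # non-excess: normal gives 3
        "entropy": entropy,
    }


# Made-up input: follower counts of the senders in one observation window.
features = distribution_features([10, 10, 250, 40, 1200])
```

Each (*) row in the table would then contribute the eight values of one such dictionary to the feature vector. For continuous quantities such as inter-tweet times, a binned histogram would likely be needed before computing entropy.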