Table 2 List of 487 features extracted by our framework

From: Early detection of promoted campaigns on social media

| Class | Feature description | No. of features |
|---|---|---|
| Network (†) | Number of nodes | 1 |
| | Number of edges | 1 |
| | (*) Strength distribution | 8 |
| | (*) In-strength distribution | 8 |
| | (*) Out-strength distribution | 8 |
| | (*) Distribution of number of nodes in the connected components | 8 |
| | Network density of whole and largest connected component | 2 |
| | Network assortativity of whole and largest connected component | 2 |
| | Mean shortest path length of the largest connected component | 1 |
| User | (*) Sender’s follower count | 8 |
| | (*) Sender’s followee count | 8 |
| | (*) Sender’s number of favorite tweets | 8 |
| | (*) Sender’s number of Twitter statuses posted | 8 |
| | (*) Sender’s number of lists subscribed to | 8 |
| | (*) Originator’s follower count | 8 |
| | (*) Originator’s followee count | 8 |
| | (*) Originator’s number of favorite tweets | 8 |
| | (*) Originator’s number of Twitter statuses posted | 8 |
| | (*) Originator’s number of lists subscribed to | 8 |
| Timing | Number of tweets appearing in a given window | 1 |
| | (*) Time between two consecutive tweets | 8 |
| | (*) Time between two consecutive retweets | 8 |
| | (*) Time between two consecutive mentions | 8 |
| Content | (*) Number of hashtags in a tweet | 8 |
| | (*) Number of mentions in a tweet | 8 |
| | (*) Number of URLs in a tweet | 8 |
| | (*,**) Frequency of POS tags in a tweet | 64 |
| | (*,**) Proportion of POS tags in a tweet | 64 |
| | (*) Number of words in a tweet | 8 |
| | (*) Entropy of words in a tweet | 8 |
| Sentiment | (***) Happiness scores of aggregated tweets | 2 |
| | (***) Valence scores of aggregated tweets | 2 |
| | (***) Arousal scores of aggregated tweets | 2 |
| | (***) Dominance scores of aggregated tweets | 2 |
| | (*) Happiness score of single tweets | 8 |
| | (*) Valence score of single tweets | 8 |
| | (*) Arousal score of single tweets | 8 |
| | (*) Dominance score of single tweets | 8 |
| | (*) Polarization score of single tweets | 8 |
| | (*) Entropy of polarization scores of single tweets | 8 |
| | (*) Positive emoticons entropy of single tweets | 8 |
| | (*) Negative emoticons entropy of single tweets | 8 |
| | (*) Emoticons entropy of single tweets | 8 |
| | (*) Ratio between positive and negative score of single tweets | 8 |
| | (*) Number of positive emoticons in single tweets | 8 |
| | (*) Number of negative emoticons in single tweets | 8 |
| | (*) Total number of emoticons in single tweets | 8 |
| | Ratio of tweets that contain emoticons | 1 |
† We consider three types of network: retweet, mention, and hashtag co-occurrence networks. The hashtag co-occurrence network is undirected.
* Distribution types. For each distribution, the following eight statistics are computed and used as individual features: min, max, median, mean, std. deviation, skewness, kurtosis, and entropy.
** Part-of-Speech (POS) tags. There are eight POS tags: verbs, nouns, adjectives, modal auxiliaries, pre-determiners, interjections, adverbs, and pronouns.
*** For each feature we compute the mean and std. deviation.
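
The eight per-distribution statistics listed in the notes can be illustrated with a minimal sketch. The function name `distribution_features` is hypothetical, and several details are assumptions the table does not settle: population (biased) moments, excess kurtosis, and Shannon entropy in bits taken over the empirical frequencies of distinct values. The authors' exact estimators may differ.

```python
import math
from collections import Counter


def distribution_features(values):
    """Return eight summary statistics for a non-empty list of numbers.

    Hypothetical helper, not the authors' code. Assumptions: population
    moments, excess kurtosis, entropy over frequencies of distinct values.
    """
    n = len(values)
    ordered = sorted(values)
    mean = sum(ordered) / n

    # Median: middle value, or the average of the two middle values.
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

    # Central moments of order 2, 3 and 4 (divide by n, not n - 1).
    m2 = sum((x - mean) ** 2 for x in ordered) / n
    m3 = sum((x - mean) ** 3 for x in ordered) / n
    m4 = sum((x - mean) ** 4 for x in ordered) / n

    std = math.sqrt(m2)
    # Skewness and kurtosis are undefined for a constant sample; use 0.0 there.
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0

    # Shannon entropy (bits) of the empirical distribution of distinct values.
    counts = Counter(ordered)
    entropy = sum((c / n) * math.log2(n / c) for c in counts.values())

    return {
        "min": ordered[0],
        "max": ordered[-1],
        "median": median,
        "mean": mean,
        "std": std,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "entropy": entropy,
    }
```

Applied to, say, the list of follower counts of all senders in a window, the eight returned values would correspond to one "(*)" row of the table.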
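
As a sanity check on the total of 487 in the caption, the per-row counts can be tallied as in the sketch below. It rests on two assumptions not spelled out in the table: the network rows are computed once for each of the three networks, and the in-strength and out-strength distributions are skipped for the undirected hashtag co-occurrence network. Under those assumptions the rows add up to the stated total. All names in the code are hypothetical.

```python
# Per-row feature counts copied from Table 2. Network rows are per network.
NETWORK_ROWS = {
    "nodes": 1,
    "edges": 1,
    "strength": 8,
    "in_strength": 8,
    "out_strength": 8,
    "component_sizes": 8,
    "density": 2,
    "assortativity": 2,
    "mean_shortest_path": 1,
}

OTHER_CLASSES = {
    "user": [8] * 10,                      # 5 sender rows + 5 originator rows
    "timing": [1, 8, 8, 8],
    "content": [8, 8, 8, 64, 64, 8, 8],
    "sentiment": [2] * 4 + [8] * 13 + [1],
}

# Assumption: rows that only make sense for directed networks.
DIRECTED_ONLY = {"in_strength", "out_strength"}

# Assumption: retweet and mention networks are directed; the source states
# only that the hashtag co-occurrence network is undirected.
NETWORK_IS_DIRECTED = {
    "retweet": True,
    "mention": True,
    "hashtag_cooccurrence": False,
}


def network_feature_count(directed):
    """Count network features for one network, dropping directed-only rows if undirected."""
    return sum(
        count
        for name, count in NETWORK_ROWS.items()
        if directed or name not in DIRECTED_ONLY
    )


def total_feature_count():
    """Sum the features over all three networks and the four other classes."""
    network_total = sum(
        network_feature_count(directed) for directed in NETWORK_IS_DIRECTED.values()
    )
    other_total = sum(sum(rows) for rows in OTHER_CLASSES.values())
    return network_total + other_total


print(total_feature_count())
```

With these assumptions the network class contributes 39 + 39 + 23 = 101 features and the remaining classes contribute 80 + 25 + 168 + 113 = 386.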