Emotions in online rumor diffusion

Emotions are regarded as a dominant driver of human behavior, and yet their role in online rumor diffusion is largely unexplored. In this study, we empirically examine the extent to which emotions explain the diffusion of online rumors. We analyze a large-scale sample of 107,014 online rumors from Twitter, as well as their cascades. For each rumor, the embedded emotions were measured based on eight so-called basic emotions from Plutchik’s wheel of emotions (i.e., anticipation–surprise, anger–fear, trust–disgust, joy–sadness). We then estimated using a generalized linear regression model how emotions are associated with the spread of online rumors in terms of (1) cascade size, (2) cascade lifetime, and (3) structural virality. Our results suggest that rumors conveying anticipation, anger, and trust generate more reshares, spread over longer time horizons, and become more viral. In contrast, a smaller size, lifetime, and virality are found for surprise, fear, and disgust. We further study how the presence of 24 dyadic emotional interactions (i.e., feelings composed of two emotions) is associated with diffusion dynamics. Here, we find that rumor cascades with high degrees of aggressiveness are larger in size, longer-lived, and more viral. Altogether, emotions embedded in online rumors are important determinants of the spreading dynamics.


Introduction
Social media platforms such as Facebook, Sina Weibo, and Twitter allow users to disseminate content through sharing (e.g., called retweeting in the case of Twitter). As a result, content can go viral and reach a large audience despite the fact that it originated from a single broadcast. Hence, understanding the diffusion of online content is relevant for a number of reasons. Marketers are interested in identifying what makes content go viral, so that marketing content can be designed accordingly [1][2][3][4]. Humanitarian organizations leverage the potential of online diffusion in social media to collect information for effective responses to natural disasters and to inform the wider public [5][6][7]. Public stakeholders are confronted with the diffusion of political content and, by understanding the underlying mechanics, can help prevent the spread of rumors [8][9][10][11].
Previous research has identified several drivers of online diffusion (see Additional file 1 for an overview). These drivers are primarily located in the different characteristics of senders. For instance, senders with a larger follower base (i.e., with more outgoing ties in the network) also reach, on average, a larger audience [12]. Other characteristics of senders are the number of followees (i.e., how many incoming ties a user has [13][14][15]) or their past engagement (i.e., the number of posts or reshares [11]). A different stream of research has examined online diffusion around specific topics (e.g., a specific election [9] or a specific disaster [5][6][7][16][17][18][19]). In this work, we add by studying the role of emotions in the diffusion of online rumors.
Emotions have been established as an important determinant of human behavior in offline settings [20][21][22]. Emotions typically arise as a response to environmental stimuli that are of relevance to the needs, goals, or concerns of users and, as a consequence, also guide user behavior in online settings [23]. Emotions influence what type of information users seek, what they process, how they remember it, and ultimately what judgments and decisions they derive from it. Emotions are themselves contagious and can spread among people, both offline (i.e., in person) [24] and online (i.e., via social media) [25][26][27][28][29].
Following the above, emotions embedded in online content are an important driver of online behavior. For instance, it was previously confirmed that emotions influence posting and liking activities [30], users' willingness-to-share [1], and actual sharing behavior [2][31][32][33]. As such, embedded emotions explain, to a large extent, the propensity to share posts, as well as user response time. Here, emotional stimuli such as emotion-laden wording trigger cognitive processing [34], which in turn results in the behavioral response of information sharing [35][36][37]. In particular, emotions embedded in online content also explain the dynamics of online diffusion. For instance, emotions describe different properties of diffusion cascades, such as their size, branching, or lifetime [38][39][40][41]. Misinformation in particular relies upon emotions in order to attract attention [11][38][42][43][44][45][46]. Given the importance of emotions in online behavior, we investigate how emotions are linked to the spread of online rumors.
Hypothesis: Emotions embedded in online rumors are associated with the size, lifetime, and structural virality of the cascade.
In this study, we empirically analyze to what extent emotions explain the diffusion of online rumors. For this, we infer the emotions embedded in replies to online rumors through the use of affective computing (see Methods). For each rumor, the degree of emotion is rated along so-called basic emotions. Basic emotions refer to a subset of emotions that are universally recognized across cultures and through which other, more complex emotions can be derived. In this work, we adopt Plutchik's wheel of emotions [22], comprising 8 basic emotions (ANTICIPATION, SURPRISE, ANGER, FEAR, TRUST, DISGUST, JOY, SADNESS). Based on these, we infer 24 dyadic emotional interactions, each representing a more complex emotion composed of two basic emotions (e.g., AGGRESSIVENESS as a combination of ANGER and ANTICIPATION). These emotions are then linked to the spread of online rumors using regression analysis. Thereby, we estimate to what extent emotions embedded in online rumors explain: (1) cascade size, that is, how many reshares a rumor generates; (2) cascade lifetime, that is, how long a rumor is active; and (3) structural virality, that is, how effectively it spreads. The latter, structural virality, provides a quantitative metric [47] aggregating the depth-breadth variation in rumor diffusion.
One work [11] contains summary statistics reporting which emotions are present in online rumors but not how emotions affect sharing. Hence, any statistical claims measuring the emotion effect (i.e., which emotions drive faster and wider rumor spreading) are precluded. This presents the added value of our work. We measure how emotions are associated with the diffusion dynamics (e.g., TRUST as an emotion is present in only a small portion of rumors but it has a large influence on virality). Because of this, our work is different in several ways: (i) we focus not only on basic emotions but also dyadic emotions, (ii) we infer the emotion effect on diffusion dynamics, and, because of that, (iii) we use a regression analysis as opposed to summary statistics. Therefore, this work is, to the best of our knowledge, the first comprehensive study assessing the link between emotions and the spread of online rumors.
We analyze a large-scale, representative sample of Twitter rumors and their corresponding cascades [11]. Specifically, our data cover the complete time frame from the launch of Twitter in 2006 until (and including) 2017. Altogether, this results in 2189 rumors associated with 107,014 cascades. The sample comprises approx. 3.7 million reshares that originate from almost 3 million different users. Based on the cascades, various control variables are constructed. Specifically, in our regression analysis, we capture time and rumor effects through the use of random effects, based on which we control for the heterogeneity among rumors (see Materials and Methods).

Dataset
A rumor is defined as a piece of content that is propagated between users but without confirmation of its veracity. This definition is rooted in social psychology literature [43,48]. For this study, a large-scale dataset comprising rumor cascades from Twitter [11] was analyzed. The resulting sample comprises all rumors from Twitter between its founding in the year 2006 until (and including) 2017. Ethics approval was obtained from ETH Zurich (2020-N-44). Overall, our sample includes 2189 rumors with a total of N = 107,014 cascades (i.e., some rumor contents were shared as part of multiple but different cascades). The rumors had approx. 3.7 million reshares originating from 3 million users (see [11] for details).

Characteristics of online rumor diffusion
The cascades were then processed as follows in order to generate additional variables. These variables refer to different characteristics of online rumor diffusion and later represent the dependent variables in the regression analysis. For simplicity, we introduce the following notation. We refer to the cascades via $j = 1, \ldots, N$. These belong to $i = 1, \ldots, 2189$ different rumors. Each cascade is a three-tuple $T_j = (r_j, t_{j0}, R_j)$, where $r_j$ is the root post that corresponds to the original broadcast, $t_{j0}$ is its timestamp, and $R_j$ is the set of reshares. A reshare $k$ has a parent $p_{jk}$ and a timestamp $t_{jk}$, i.e., $R_j = \{(p_{jk}, t_{jk})\}_k$.
(1) Cascade size: The cascade size counts how many reshares a cascade generated. Formally, it amounts to all reshares plus 1 (for the root), i.e., $|R_j| + 1$.
(2) Cascade lifetime: The cascade lifetime is the timespan during which a rumor cascade was active, thus the elapsed time between the root broadcast and the last reshare. It is calculated via $\max_k t_{jk} - t_{j0}$.
(3) Structural virality: Structural virality [47] provides an aggregated metric combining the depth and breadth of a cascade. A higher structural virality corresponds to a cascade that is both of great depth and where each reshare generated a large relative number of additional reshares (i.e., a high branching factor). As proposed in [47], structural virality is based on the idea of the Wiener index, i.e.,
$$v(T_j) = \frac{1}{n(n-1)} \sum_{j_1=1}^{n} \sum_{j_2=1}^{n} d_{j_1,j_2},$$
where $n = |R_j| + 1$ is the number of nodes in the cascade and $d_{j_1,j_2}$ is the length of the shortest path between nodes $j_1$ and $j_2$ in the tree $T_j$. Intuitively, structural virality reflects the average distance between all pairs of posts in the cascade.
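The three characteristics above can be illustrated with a minimal sketch. It assumes a hypothetical input format that is not taken from the paper: node 0 is the root, the k-th entry of `reshares` is node k, and each entry is a `(parent_node, timestamp)` pair forming a valid tree. Structural virality is computed as the average shortest-path distance over all ordered node pairs, following the definition in [47].

```python
from collections import deque


def cascade_metrics(root_time, reshares):
    """Return (size, lifetime, structural_virality) for one cascade.

    Assumed format (hypothetical): node 0 is the root post; the k-th
    reshare (k = 1, 2, ...) is node k and is given as (parent_node, timestamp).
    """
    n = len(reshares) + 1                      # all reshares plus the root
    last = max((t for _, t in reshares), default=root_time)
    lifetime = last - root_time                # time from root to last reshare

    # Build an undirected adjacency list of the cascade tree.
    neighbors = [[] for _ in range(n)]
    for k, (parent, _) in enumerate(reshares, start=1):
        neighbors[parent].append(k)
        neighbors[k].append(parent)

    if n < 2:
        return n, lifetime, None               # virality undefined for a lone root

    # Sum shortest-path lengths over all ordered pairs via BFS from each node.
    total = 0
    for source in range(n):
        dist = [-1] * n
        dist[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist)

    return n, lifetime, total / (n * (n - 1))
```

The repeated breadth-first search is quadratic in the cascade size, which is acceptable for illustration; large cascades would call for a linear-time tree algorithm.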

Model variables on heterogeneity between rumor cascades
Model variables x j , concerning the heterogeneity among rumor cascades, were computed as in earlier research [11,12,31,38]. These later act as controls. In our study, controls are (1) account age; (2) a binary dummy representing whether the account is officially labeled as "verified" (= 1 if yes, i.e., Twitter displays a blue badge next to it); (3) the number of followers (outgoing ties); (4) the number of followees (incoming ties); and (5) user engagement, that is, the average number of posts, reshares, and likes relative to the account age as in [11]. These variables reflect that the senders of rumors vary in their social influence. Note that all of the above variables were computed at the level of cascades (which is later our unit of analysis). Additional sources of heterogeneity among rumors are captured via rumor-level random effects.

Computing emotions embedded in online rumors
For all cascades, we measured the emotions embedded in replies to rumor cascades. Here, we distinguish basic emotions, bipolar emotion pairs, and dyadic emotional interactions comprising primary, secondary, and tertiary dyads. The computation of the emotions is detailed below (see [22] for further details).
Basic emotions: Basic emotions refer to a subset of emotions that are universally recognized across cultures and through which other, more complex emotions can be derived [20,21]. In our study, Plutchik's wheel of emotions [22] is adopted as it is a common tool in affective computing [49]. It defines 8 basic emotions (see Fig. 1): ANTICIPATION, SURPRISE, ANGER, FEAR, TRUST, DISGUST, JOY, and SADNESS.
Our computation follows a dictionary-based approach as in [11]. Dictionary-based approaches are widely used when large-scale analyses of emotions are performed with the objective of explanatory modeling and thus reliable interpretations [38,41]. In our work, the NRC emotion lexicon was used [50], which classifies English words into the 8 basic emotions. For all cascades $j$, the content of the replies was tokenized and the frequency of dictionary terms per basic emotion was counted, resulting in an 8-dimensional emotion score $e_j$. Afterwards, the vector was normalized to sum to one across basic emotions (i.e., $e_j \leftarrow e_j / \lVert e_j \rVert_1$). We omit rumor cascades that do not contain any emotional words from the NRC emotion lexicon (since, otherwise, the denominator is not defined). As a result, the 8 emotion dimensions in $e_j \in [0, 1]^8$ range from zero to one. Owing to this normalization, replies to rumors can embed a combination of multiple emotions (e.g., 40% ANGER and 60% FEAR).
Figure 1 Plutchik's wheel of emotions [22]
Bipolar emotion pairs: The 8 basic emotions form four bipolar emotion pairs (ANTICIPATION-SURPRISE, ANGER-FEAR, TRUST-DISGUST, JOY-SADNESS), each consisting of one emotion classified as positive and its complement classified as negative. We calculate a 4-dimensional score $\phi_j^{\text{pairs}}$ that measures the difference between a specific positive emotion and its complement from the set of negative emotions. For example, ANGER-FEAR refers to the difference between ANGER and FEAR.
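The dictionary-based scoring described above can be sketched as follows. The word list `TOY_LEXICON` is a hypothetical stand-in for illustration only and is not the NRC emotion lexicon; the tokenizer and the function names are likewise assumptions. The sketch counts lexicon hits per basic emotion, normalizes the counts to sum to one, and then forms the four bipolar differences.

```python
import re

EMOTIONS = ["anticipation", "surprise", "anger", "fear",
            "trust", "disgust", "joy", "sadness"]

# Hypothetical toy lexicon (NOT the NRC lexicon): word -> list of basic emotions.
TOY_LEXICON = {
    "furious": ["anger"],
    "scared": ["fear"],
    "shocking": ["surprise"],
    "reliable": ["trust"],
    "gross": ["disgust"],
}

# Bipolar pairs as (first emotion, complement), following the text.
PAIRS = [("anticipation", "surprise"), ("anger", "fear"),
         ("trust", "disgust"), ("joy", "sadness")]


def emotion_scores(text, lexicon=TOY_LEXICON):
    """Return the normalized 8-dimensional emotion score, or None if no hits."""
    counts = dict.fromkeys(EMOTIONS, 0)
    for token in re.findall(r"[a-z']+", text.lower()):
        for emotion in lexicon.get(token, ()):
            counts[emotion] += 1
    total = sum(counts.values())
    if total == 0:
        return None  # cascade would be omitted: the denominator is undefined
    return {emotion: count / total for emotion, count in counts.items()}


def bipolar_pairs(scores):
    """Return the 4-dimensional difference score, one entry per bipolar pair."""
    return {f"{a}-{b}": scores[a] - scores[b] for a, b in PAIRS}
```

Returning `None` for texts without lexicon hits mirrors the omission of cascades without emotional words.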

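Dyadic emotional interactions: Plutchik's wheel of emotions further defines 24 dyadic emotional interactions, which are more complex emotions composed of two basic emotions (see Fig. 1, round lines). The dyadic emotional interactions comprise: (1) primary dyads, which combine basic emotions that are one petal apart from each other (e.g., AGGRESSIVENESS = ANGER + ANTICIPATION); (2) secondary dyads, which combine basic emotions that are two petals apart; and (3) tertiary dyads, which combine basic emotions that are three petals apart.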
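As an illustration of how a dyad can be derived from basic emotion scores, the sketch below computes the 8 primary dyads by adding the two constituent scores. The additive rule is an assumption that simply takes the notation AGGRESSIVENESS = ANGER + ANTICIPATION literally; the exact scoring used in the study may differ. The dyad names AWE and DISAPPROVAL do not appear in the text and are taken from Plutchik's model as commonly presented.

```python
# Primary dyads: combinations of basic emotions that are one petal apart.
PRIMARY_DYADS = {
    "love": ("joy", "trust"),
    "submission": ("trust", "fear"),
    "awe": ("fear", "surprise"),
    "disapproval": ("surprise", "sadness"),
    "remorse": ("sadness", "disgust"),
    "contempt": ("disgust", "anger"),
    "aggressiveness": ("anger", "anticipation"),
    "optimism": ("anticipation", "joy"),
}


def primary_dyad_scores(scores):
    """Add the two constituent basic emotion scores (assumed additive rule)."""
    return {name: scores[a] + scores[b] for name, (a, b) in PRIMARY_DYADS.items()}


# Hypothetical emotion scores for one cascade (they sum to one).
example = {"anticipation": 0.2, "surprise": 0.1, "anger": 0.4, "fear": 0.1,
           "trust": 0.05, "disgust": 0.1, "joy": 0.05, "sadness": 0.0}
dyads = primary_dyad_scores(example)
```

Secondary and tertiary dyads could be handled with analogous mappings.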

Regression analysis
To analyze the role of emotions in online rumor diffusion, we apply a generalized regression model. Regression models are generally regarded as an explanatory approach with the ability to document statistical relationships and, in particular, estimate effect sizes [51]. Furthermore, regression models are widely used to estimate the marginal effect of content on diffusion characteristics [11,31,38,41]. This allows us to later make inferences that test our research hypothesis statistically.
Let $y_j$ denote a characteristic of the cascade of interest, namely cascade size, cascade lifetime, or structural virality. We then model $y_j$ of the cascade via a two-level generalized hierarchical regression:
$$\text{Level 1:} \quad y_j = \alpha_i + \beta^\top \phi_j + \gamma^\top x_j + \varepsilon_j,$$
$$\text{Level 2:} \quad \alpha_i = \gamma_0 + \gamma_i,$$
where level 1 refers to the cascade level and level 2 to the rumor level, $\phi_j$ denotes the emotion variables, and $x_j$ denotes the model variables on heterogeneity between rumor cascades. The other variables are as follows. The coefficient $\beta$ captures the marginal effect of emotions. This is later our variable of interest as it measures the contribution of emotions to rumor diffusion. The coefficient $\gamma$ is used to control for other model variables at the rumor cascade level. Both $\gamma_0$ and $\gamma_i$ are assumed to be independent and identically normally distributed with mean zero. Then $\gamma_0$ reflects the base diffusion in the sample, while $\gamma_i$ controls for variation at rumor level. Notably, this turns $\alpha_i$ into a rumor-specific random effect. The error term $\varepsilon_j$ is assumed to be independent and identically normally distributed with mean zero.
The use of regression analysis is imperative for the scope of our study. The reasons are as follows. (1) Our objective is different from predictive modeling [51], where the focus is on accurate estimates of the outcome variable. Instead, we are concerned with the model logic as it allows us to interpret the model coefficients. (2) Our objective is also different from analyzing summary statistics as in [11]. Summary statistics deal with comparisons across groups and thereby ignore other sources of heterogeneity in the sample. For instance, the summary statistics on rumor emotions in [11] only report which emotions are common but not how emotions are associated with sharing dynamics. This is especially relevant for our research as we expect that some properties of rumor diffusion are also due to the social influence of the sender. Hence, by combining emotions and further controls in a joint regression model, we can isolate the marginal effect of emotions on the diffusion dynamics, which would not be possible with summary statistics.
Note that a regression analysis based directly on the basic emotions is precluded due to multicollinearity (recall that the emotion scores $e_j$ sum to one across basic emotions). Instead, the regression analysis is performed using the bipolar emotion pairs $\phi_j^{\text{pairs}}$ and the dyadic emotional interactions. For the latter, we fit 12 separate models, i.e., one for each pair among the emotional dyads, due to linear dependencies between the dyads.
In our implementation, the estimator depends on the distribution of $y_j$ as follows:
(1) Cascade size is modeled via a negative binomial regression with log-transformation. The reason is that cascade size denotes count data with overdispersion (i.e., variance larger than the mean).
(2) Cascade lifetime is first log-transformed and then modeled via a normal distribution. This is consistent with previous research assuming a log-normal distribution for response times [12].
(3) Structural virality is modeled via a gamma regression with a log-link. This allows us to account for a skewed distribution of continuous, non-negative variables.
All estimations are conducted based on the R package lme4. Before estimation, all model variables are z-standardized. Owing to this, the regression coefficients quantify changes in the dependent variable in standard deviations. This is beneficial as it allows us to compare the estimated coefficients across emotions in a straightforward manner.
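To make the standardization and the reading of coefficients concrete, the following sketch z-standardizes a variable and converts a coefficient on the log scale into a percentage change per one standard deviation. It rests on two assumptions that are not stated in the text: the sample standard deviation is used, and the percentage effects reported later are read as exp(beta) - 1, which is the usual interpretation under a log link or a log-transformed outcome. The coefficient value in the usage line is back-calculated for illustration and is not an estimate from the study.

```python
import math
from statistics import mean, stdev


def z_standardize(values):
    """Center to mean zero and scale to unit (sample) standard deviation."""
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - mu) / sd for v in values]


def percent_change(beta):
    """Percent change in the outcome per one-unit increase of a predictor,
    assuming the coefficient lives on the log scale of the outcome."""
    return (math.exp(beta) - 1.0) * 100.0


# Illustration: the log-scale coefficient that would correspond to +19.18%.
implied_beta = math.log(1.1918)
effect = percent_change(implied_beta)
```

Because the predictors are z-standardized, a one-unit increase equals one standard deviation, so effects can be compared across emotions.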

Summary statistics
The diffusion dynamics in our data are as follows. Figure 2 compares cascade size, lifetime, and structural virality via complementary cumulative distribution functions (CCDF). On average, a rumor cascade reaches 31.95 users and has a lifetime of 123.18 hours. The mean structural virality is 1.26.
Basic emotions: Fig. 3 plots the CCDFs for each of the eight basic emotions, while Fig. 4 reports the relative proportion of emotional intensity averaged over all rumors. We find that a large proportion of rumors embed DISGUST and SURPRISE, whereas comparatively few rumors embed JOY and SADNESS. Evidently, rumors embed more ANGER (relative share of 12.34%) than FEAR (10.74%), more SURPRISE (16.44%) than ANTICIPATION (14.23%), more DISGUST (23.58%) than TRUST (9.05%), and more JOY (7.39%) than SADNESS (6.23%).
Dyadic emotional interactions: Fig. 5 shows the distribution of the dyadic emotional interactions. For the primary emotion dyads, we find that a large proportion of rumors embed CONTEMPT and REMORSE, whereas fewer rumors embed LOVE and SUBMISSION. For the secondary and tertiary emotion dyads, we find that many rumor cascades embed UNBELIEF and SHAME. In contrast, only a relatively small proportion of rumors embed DESPAIR and PESSIMISM.
Note that the above summary statistics only report the relative frequency of emotions but do not allow one to draw conclusions regarding how users respond to emotions. This is studied in the following regression analyses.

Regression results from bipolar emotion pairs
In the following, we report results for the bipolar emotion pairs $\phi_j^{\text{pairs}}$. We use regression analysis to explain different characteristics of cascades based on the bipolar emotion pairs. The parameter estimates in Fig. 6 show that the 8 basic emotions are important determinants of the spreading dynamics of rumors. Across all dependent variables, we find coefficients that are positive and statistically significant for the ANTICIPATION-SURPRISE, ANGER-FEAR, and TRUST-DISGUST dimensions. Hence, rumors are estimated to diffuse more pronouncedly when embedding positive emotions.
The predicted marginal effects for the bipolar emotion pairs are shown in Fig. 7. Rumors embedding ANTICIPATION, ANGER, and TRUST generate more reshares, spread over a longer time horizon, and become more viral. The coefficient for the JOY-SADNESS emotion pair is not significant.
Our regression model controls for heterogeneity in users' social influence. The corresponding estimates are omitted for the sake of brevity (their findings have been discussed elsewhere, e.g., in [31]). In short, rumor cascades initiated from accounts that are verified and younger are linked to a larger, longer, and more viral spread. Similar relationships are observed for users exhibiting a higher engagement level and a greater number of followers. In contrast, a higher number of followees is negatively associated with the size, lifetime, and structural virality of a cascade.
We calculated the pseudo-$R^2$ for each model, resulting in relatively high values of 0.64 for cascade size, 0.43 for cascade lifetime, and 0.31 for structural virality. Evidently, the model variables explain the variation in the dependent variables to a large extent. Furthermore, a visual inspection of the actual vs. fitted plot and goodness-of-fit tests indicate that the models are well specified. This is also supported when considering the differences in AIC between the models estimated with and without emotion variables. For each dependent variable, the difference is greater than the threshold [52] of 10 (difference in cascade size: 226.16; lifetime: 52.22; structural virality: 121.03), indicating strong support for the corresponding candidate models. Therefore, the inclusion of the emotion variables in the regression model is to be preferred.
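The model comparison above follows a simple rule that can be written down directly. The sketch below uses the standard definition AIC = 2k - 2 ln L and flags the model with emotion variables as preferred when its AIC is lower by more than the threshold of 10. The function names are illustrative, and only the three reported AIC differences are taken from the text.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def prefers_emotion_model(aic_without, aic_with, threshold=10.0):
    """True if adding emotion variables lowers the AIC by more than the threshold."""
    return (aic_without - aic_with) > threshold


# AIC differences reported in the text (model without minus model with emotions).
reported_differences = {
    "cascade size": 226.16,
    "cascade lifetime": 52.22,
    "structural virality": 121.03,
}
all_above_threshold = all(d > 10.0 for d in reported_differences.values())
```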

Regression results from dyadic emotional interactions
We now study how the presence of 24 dyadic emotional interactions is associated with the diffusion dynamics of online rumors. For this purpose, we employ the previous regression model, but this time include the emotion variables $\phi_j^{\text{primary}}$, $\phi_j^{\text{secondary}}$, and $\phi_j^{\text{tertiary}}$. Figure 8 shows the predicted marginal effects for the 8 primary, 8 secondary, and 8 tertiary dyadic emotional interactions.
Primary dyadic emotional interactions: Rumor cascades with higher values of AGGRESSIVENESS, LOVE, and OPTIMISM are larger in size, longer-lived, and more viral. We observe no statistically significant effect for the SUBMISSION-CONTEMPT pair. Overall, the largest positive association is observed for AGGRESSIVENESS (i.e., the combination of ANTICIPATION and ANGER). An increase of one standard deviation in this dimension is linked to a 19.18% increase in the cascade size, an 8.33% increase in the cascade lifetime, and a 1.69% increase in structural virality.

Secondary dyadic emotional interactions: Rumor cascades with higher values of HOPE vs. UNBELIEF generate more reshares, spread over a longer time horizon, and become more viral. We further find that rumor cascades embedding GUILT and DESPAIR are negatively associated with the size, lifetime, and structural virality of a cascade. The CURIOSITY-CYNICISM pair is not statistically significant at common statistical significance levels.
Tertiary dyadic emotional interactions: Rumor cascades with higher values of ANXIETY are larger in size, longer-lived, and more viral. We also find a larger size, lifetime, and virality for rumor cascades embedding high levels of DOMINANCE, PESSIMISM, and ANXIETY. We find no statistically significant effect for the SENTIMENTALITY-MORBIDNESS pair.
The control variables tend in a similar direction as in the analysis of the basic emotions. Again, the difference in AIC (comparing the model with and without emotions) is above the common threshold of 10 [52]. Therefore, the models that include emotions are to be preferred.

Sensitivity across rumor topics
Our empirical analysis is based on a large-scale dataset with Twitter rumors across varying topics. We now study topic-specific variations. For this purpose, we employ the topic categorization from [11], which classifies Twitter rumors into topics. Here, we focus on the topics Politics, Business, and Science given their high relevance for society. Note that the topic Science is broadly defined and also comprises related topics such as health-related rumors. For each of the three topics, we generate a subset of the data and reestimate our models. The results are visualized in Fig. 9. We find that emotions explain differences in cascade size, cascade lifetime, and structural virality at a statistically significant level for the topics Politics and Business. In contrast, we find mixed results for Science. These results are in line with existing literature. For example, [31] find a pronounced role of political content in social media sharing. The authors argue that political topics are more controversial and thus attract more attention, which itself influences sharing behavior.
Figure 9 Standardized parameter estimates and 95% confidence intervals for different subsets of rumors filtered by topic

Model checks
We conducted a series of additional model checks that contribute to the robustness of our findings. First, we followed common practice in regression analysis and checked that variance inflation factors as an indicator of multicollinearity were below five [53]. This check led to the desired outcome. Second, we controlled for year-level time effects (i.e., via clustered standard errors and different study horizons) in addition to rumor-level random effects that are already included in our regression model. We obtained conclusive findings. Third, we controlled for non-linear relationships via quadratic terms. In all cases, our findings were supported.

Validation of emotion scores
Our results rely on the validity of dictionaries to extract emotions from online rumors. To check how perceived emotions in rumors align with the dictionary-based emotions, we conducted a survey using the online survey platform Prolific (https://www.prolific.co/). We asked n = 7 participants (English native speakers) to rate the presence of the eight basic emotions on a Likert scale from -3 to 3 (here: -3 indicates no emotion present while 3 refers to a high degree of emotion present) for a set of 100 randomly sampled rumors. As shown in Table 1, the participants exhibited a statistically significant interrater agreement according to Kendall's W for each of the 8 basic emotions (p < 0.01).
Overall, when aggregating across all 8 basic emotions, the correlation between the dictionary-based emotion scores and human annotations is ρ = 0.17 (p < 0.01) and thus statistically significant at common significance thresholds. This demonstrates that dictionaries are able to capture emotions in online rumors.
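For readers unfamiliar with the interrater agreement statistic used above, a minimal sketch of Kendall's W is given below. It implements the textbook formula without tie correction, W = 12 S / (m^2 (n^3 - n)), for m raters who each rank n items. Since Likert ratings typically contain ties, the actual analysis would require a tie-corrected variant, so this is a simplified illustration rather than the exact procedure of the study.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance without tie correction.

    rankings: list of m lists; each inner list assigns the ranks 1..n
    (no ties) to the same n items in the same item order.
    """
    m = len(rankings)
    n = len(rankings[0])
    # Total rank received by each item across all raters.
    rank_sums = [sum(rater[i] for rater in rankings) for i in range(n)]
    mean_rank_sum = m * (n + 1) / 2
    # S: sum of squared deviations of the rank sums from their mean.
    s = sum((total - mean_rank_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))
```

A value of 1 indicates that all raters produce identical rankings, while a value of 0 indicates no agreement.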

Negation handling
We performed negation scope detection [54,55] to analyze the robustness to how negations (e.g., "not," "no") are handled by the dictionary approach. For example, phrases like "I am surprised" and "I am not surprised" contain the same number of emotional words but convey different emotions to the reader. We analyzed emotional words that are negated by surrounding negation words as follows: (i) We searched for negations using a predefined list of negation words. Here, we used the list of negations from the R package sentimentr. (ii) We recalculated the emotion scores by counting all emotional words in the neighborhood of the negation word as belonging to the opposite emotional dimension (e.g., $\text{JOY} = \text{JOY} + \text{SADNESS}_{\text{negated}}$). The neighborhood is set to 5 words before and 2 words after the negation. We then compared the emotion scores with negation handling to the values obtained without negation handling. As a result, we found that merely 5.58% of the emotional words in rumors are affected by negations (i.e., lie within negation scopes). Furthermore, the emotion scores with negation handling are highly correlated with the emotion scores without negation handling (ρ > 0.9). Altogether, this implies that our analysis and findings are robust to negations.
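The two-step negation procedure can be sketched as follows. The negation list and the lexicon below are small hypothetical stand-ins (the study uses the negation list from sentimentr and the NRC lexicon), and the tokenizer is an assumption. The neighborhood of 5 words before and 2 words after a negation follows the description above, and opposite emotions follow the bipolar pairs.

```python
import re

# Opposite emotions according to the four bipolar pairs.
OPPOSITE = {
    "anticipation": "surprise", "surprise": "anticipation",
    "anger": "fear", "fear": "anger",
    "trust": "disgust", "disgust": "trust",
    "joy": "sadness", "sadness": "joy",
}

NEGATIONS = {"not", "no", "never"}           # hypothetical short list
TOY_LEXICON = {"surprised": "surprise",      # hypothetical toy lexicon
               "happy": "joy",
               "afraid": "fear"}


def emotion_counts_with_negation(text, before=5, after=2):
    """Count emotional words, flipping those inside a negation neighborhood."""
    tokens = re.findall(r"[a-z']+", text.lower())

    # Step (i): mark every position within the neighborhood of a negation.
    negated = set()
    for i, token in enumerate(tokens):
        if token in NEGATIONS:
            start = max(0, i - before)
            stop = min(len(tokens), i + after + 1)
            negated.update(range(start, stop))

    # Step (ii): count emotional words, assigning negated ones to the opposite.
    counts = {}
    for i, token in enumerate(tokens):
        emotion = TOY_LEXICON.get(token)
        if emotion is None:
            continue
        if i in negated:
            emotion = OPPOSITE[emotion]
        counts[emotion] = counts.get(emotion, 0) + 1
    return counts
```

Comparing these counts with the counts obtained without the flipping step gives the share of emotional words affected by negation.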

Discussion
In this work, we provided a large-scale study of emotions in online rumor diffusion. For this purpose, 2189 rumors from Twitter with approx. 3.7 million reshares were analyzed with regard to the embedded emotions. Overall, we found the following: (1) Negative emotions are frequently embedded in rumors. Especially frequent are DISGUST (relative share of 23.58%) and SURPRISE (16.44%). (2) The relationship between emotions and the structure of cascades is statistically significant at common significance levels for almost all emotions under study. (3) Rumors embedding ANTICIPATION, ANGER, and TRUST are estimated to reach a significantly larger number of individuals and diffuse significantly longer and more virally. Interestingly, while negative emotions are more often embedded in rumors, positive emotions are particularly relevant for explaining the diffusion dynamics. (4) A particularly large effect of emotions on the diffusion characteristics is found for AGGRESSIVENESS (which is a derived emotion composed of ANTICIPATION and ANGER). A one standard deviation higher level of AGGRESSIVENESS is predicted to generate 19.18% more reshares, to be active for 8.33% longer, and to spread 1.69% more virally. Overall, our study establishes emotions as important determinants that describe the spread of online rumors.
Our results contribute to the understanding of online rumor diffusion. As shown by our analysis, emotions are important determinants in explaining the structure of rumor cascades, specifically how many users are involved, the active lifespan and, to a lesser extent, structural virality. The findings are consistent across basic emotions and also dyadic emotional interactions (primary, secondary, tertiary). In addition, our results suggest considerable heterogeneity in the role of emotions. Strong effects are found for most basic emotions (ANTICIPATION, SURPRISE, ANGER, FEAR, TRUST, DISGUST), albeit with the exception of JOY and SADNESS. Similar patterns are observed when studying more complex (derived) emotions. Here, the largest estimated effect size is associated with AGGRESSIVENESS. A one standard deviation higher level of AGGRESSIVENESS is predicted to generate 19.18% more reshares, cascades that are active for 8.33% longer, and a 1.69% increase in structural virality. Thereby, we reveal AGGRESSIVENESS as a dominant driver of rumor diffusion.
Our work also expands upon rumor theory from offline settings. Offline rumors have a higher chance of dissemination when conveying anxiety [56] and, in particular, negative emotions [42,43]. However, the underlying evidence stems from offline rumors rather than online rumors. Our work adds in two ways: First, we study the role of emotions in the diffusion of online rumors. While rumor diffusion in offline settings is more pronounced for negative emotions, we observe the opposite for online rumors, for which positive emotions appear more influential. Second, we not only compare positive vs. negative emotions but perform a granular study across primary, secondary, and tertiary emotional dyadic interactions. This provides rich findings on the heterogeneity of emotion effects. As such, we confirm that ANXIETY is an important driver for rumor diffusion not only in offline but also in online settings. However, further emotions are also relevant: a particularly pronounced role is found with regard to AGGRESSIVENESS. To the best of our knowledge, the importance of AGGRESSIVENESS in rumor diffusion was previously overlooked.
In our study, inferences were made based on data from Twitter. Twitter has a wide popularity with more than 300 million active users. In addition, it plays an important part in rumor diffusion due to its influential role in the political discourse [10]. This makes our findings directly relevant to both social media platforms and, in particular, public stakeholders. For the same reason, established procedures were followed when compiling the data [11], as this ensures that findings are drawn from a realistic, large-scale dataset of Twitter rumors. To the best of our knowledge, our work is the first statistical analysis linking emotions to online rumor diffusion.
As with other studies, ours is subject to limitations that provide opportunities for future research. First, this study is based on observational inferences, while we leave the extension to (quasi-)experimental settings, and thus causal inferences, to future work. Nevertheless, our study design ensures that many potential confounding factors can be ruled out. This is because of the temporal order (i.e., the emotion-laden wording precedes the actual cascade) and the fact that further sources of variability among rumors are captured through rumor-level random effects. Second, our study employs statistical inferences that provide explanatory insights. This allows us to quantify the marginal contribution of emotions to online rumor diffusion. A different objective is to use emotions for predictive modeling, which is discussed elsewhere [57][58][59][60].
Our work entails several implications. It emphasizes the necessity of considering emotions when studying rumor diffusion. Emotions are also relevant in practice, particularly for social media platforms. To counter the proliferation of online rumors, social media platforms should seek solutions, based on which emotions can be actively managed. Our study also encourages a granular investigation of emotions for related research questions, whereby not only basic emotions but also derived emotions are considered. Such granular analyses are comparatively more challenging in lab experiments; however, a remedy is offered by computational social science based on which large-scale datasets from online behavior can be mined.