  • Regular article
  • Open access

Quantifying echo chamber effects in information spreading over political communication networks

Abstract

Echo chambers in online social networks, in which users prefer to interact only with ideologically-aligned peers, are believed to facilitate misinformation spreading and contribute to radicalizing political discourse. In this paper, we gauge the effects of echo chambers in information spreading phenomena over political communication networks. Mining 12 million Twitter messages, we reconstruct a network in which users interchange opinions related to the impeachment of the former Brazilian President Dilma Rousseff. We define a continuous political leaning parameter, independent of the network’s structure, that allows us to quantify the presence of echo chambers in the strongly connected component of the network. These are reflected in two well-separated communities of similar sizes with opposite views of the impeachment process. By means of simple spreading models, we show that the capability of users to propagate the content they produce, measured by the associated spreading capacity, strongly depends on their attitude. Users expressing pro-impeachment leanings are capable of transmitting information, on average, to a larger audience than users expressing anti-impeachment leanings. Furthermore, the users’ spreading capacity is correlated to the diversity, in terms of political position, of the audience reached. Our method can be exploited to identify the presence of echo chambers and their effects across different contexts and shed light upon the mechanisms that allow echo chambers to be broken.

1 Introduction

Online social networks in which users can be both consumers and producers of content, such as Twitter or Facebook, provide means to exchange information in an almost instantaneous, inexpensive, and unmediated way, forming a substrate for the spread of information with unprecedented capabilities. These new channels of communication have enormously altered the way in which we make decisions, form political opinions, take sides on different issues, or choose between the adoption of different technological options [1]. Such online communication networks are orders of magnitude larger than those classically available in social sciences [2], making it possible to perform measurements and experiments that have led to the definition of a new computational social science [3].

One of the characteristic features of online communication networks is their marked degree of homophily. That is, individuals prefer to interact with others similar to them, or who share the same views and orientations [4,5,6]. Homophily leads to a natural polarization of societies into groups with different perspectives, that leave fingerprints in the online realm. These digital traces provide researchers with large-scale data sets for the study of polarization in different contexts, such as the US and French presidential elections [7], secular vs. Islamist discussions during the 2011 Egyptian revolution [8, 9], or the 15M movement of 2011 in Spain [10]. Political orientation, in particular, has been shown to drive the segregation of online communication networks into separated communities [11, 12]. The presence of these clusters formed by users with a homogeneous content production and diffusion has been named echo chambers [13], referring to the situation in which one’s beliefs are reinforced due to repeated interactions with individuals sharing the same points of view [14]. Echo chambers have been shown to pervade the offline realm [15], to be related to the spreading of misinformation [16, 17], or the development of ideological radicalism [18]. Recent studies, however, have challenged the impact of echo chambers and partisan segregation in communication networks over online social media [19, 20].

This novel debate calls for a quantitative analysis aimed at identifying the impact of polarization on the diffusion of information. In this paper, we contribute to this endeavor by quantifying the political leaning of users in the description of the associated echo chambers, and the effects on information spreading processes running on top of them. To this aim, we reconstruct a political communication (PC) network, in which individuals exchange messages related to the impeachment process of the former Brazilian President Dilma Rousseff, over the social microblogging platform Twitter. We collected over 12 million tweets from half a million users, in a time window of 9 months, covering the main events related to the impeachment process and related street protests. The political orientation of users was inferred by means of a hand-tagged analysis of the hashtags adopted in the messages, which were assigned anti-impeachment, pro-impeachment, or neutral leanings.

The topological analysis of the resulting static PC network reveals clusters of individuals sharing similar opinions, defining the presence of echo chambers. We gauge the impact of these echo chambers over information spreading by means of simple spreading models, characterizing the efficiency of single users to disseminate information, or spreading capacity. Differently from previous studies, we characterize information diffusion by taking into account the full temporal evolution of the social interactions, represented in terms of a temporal network [21, 22]. This ensures that the spreading process respects the communication dynamics. Our analysis shows that the spreading capacity of users is strongly correlated with their political orientation: information sent by pro-impeachment individuals spreads throughout the network much better than messages sent by other users. Furthermore, by analyzing the composition of the audience reached, we discover that users with larger spreading capacity are able to reach individuals with more diverse leanings, actually escaping their echo chamber.

2 Polarization in political communication networks

In Twitter, users post real-time short messages (tweets), sometimes annotated with hashtags indicating the topic of the message, that are broadcast to the community of their followers. A user can also transmit (retweet) messages from other users, forwarding them to his/her own followers, as a way to endorse their content. Analyses of retweets (RTs) have been used to study viral propagation of information in several contexts [23,24,25]. However, RTs do not involve an explicit effort of content production and do not convey a specific communication target. For this reason, here we discard RTs from our analysis and focus on tweets that include an explicit mention of another user, with the purpose of establishing or continuing a discussion on some topic, carrying even personal messages [26]. This choice allows us to single out only actual social interactions between users, to reconstruct a communication network in which people actually exchange information, discuss, and form their opinion reacting in real time to ongoing political events.

As an example of strongly polarized political discussion, we focus on the debate ensuing the impeachment process of the former Brazilian President Dilma Rousseff, taking place during 2016. Tweets related to the impeachment process were gathered by setting a specific filter for public tweets containing selected keywords. The keyword list was kept up to date as new trending topics continuously appeared on Twitter, see Supplementary Information (SI) (Additional file 1). Furthermore, the full dynamics of social interactions was taken into account by including the real timing of tweets in a temporal network representation [22]. This ensures that information diffusion over the resulting temporal PC network follows time-respecting paths, which are expected to have an effect in slowing down or speeding up the spreading dynamics [27]. From this temporal network representation, a static aggregated, directed, weighted network [28] was constructed, in which a directed link from node i to node j indicates a message sent from user i to user j. The associated weight \(W_{ij}\) represents the number of tweets from i to j. Note that, while we keep the temporal evolution of the social interactions when addressing information spreading dynamics, we measure polarization over the aggregated, static network representation.
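For illustration, the aggregation of the temporal network into the static, directed, weighted representation can be sketched in Python as follows. This is a minimal sketch and not the pipeline used for the analysis: the function name `aggregate_mentions`, the layout of the event tuples, and the choice to drop self-mentions are assumptions made here.

```python
from collections import Counter

def aggregate_mentions(events):
    """Collapse time-stamped mentions (t, sender, receiver) into weights W[(i, j)].

    W[(i, j)] counts the tweets sent from user i to user j.
    """
    weights = Counter()
    for _t, sender, receiver in events:
        if sender != receiver:  # assumption: self-mentions are ignored
            weights[(sender, receiver)] += 1
    return weights

# Toy, made-up event list: (timestamp, sender, receiver)
events = [(1, "a", "b"), (2, "a", "b"), (3, "b", "a"), (4, "b", "c")]
W = aggregate_mentions(events)
```

The time stamps are deliberately kept in the input, since the same event list is what a spreading simulation over time-respecting paths would consume.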

Twitter is known to be populated by social bots, which contribute to the spreading of misinformation and poison political debate [29]. Recent studies revealed that while bots tend to interact with humans, e.g. by targeting influential users, the opposite behavior, interactions from humans toward bots, is far less frequent [30, 31]. Thus, once the aggregated network was reconstructed, we extracted its largest strongly connected component (SCC) [28] to possibly discard social bots and ensure that only real social interactions between users are considered. Our analysis is restricted only to the set of individuals composing this SCC. This choice comes at the cost of greatly reducing the network size (almost by 90%), but it ensures that each user can be both source and destination of information content. In this way, information transfer is in principle possible between any pair of users, and it is possible to single out the impact of the network’s dynamics. In Table 1 we present a summary of the main topological properties of the PC network and its SCC. See Methods and SI for a detailed explanation of the data set collection.
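The extraction of the largest SCC can be illustrated with a short, dependency-free Python sketch. Kosaraju's two-pass algorithm is used here purely as an example; the text does not state which algorithm or library was employed, and the function name `largest_scc` and the toy edge list are hypothetical.

```python
from collections import defaultdict

def largest_scc(edges):
    """Return the largest strongly connected component of a directed graph."""
    succ, pred, nodes = defaultdict(list), defaultdict(list), set()
    for i, j in edges:
        succ[i].append(j)
        pred[j].append(i)
        nodes.update((i, j))

    # Pass 1: iterative DFS on the graph, recording nodes by finishing time.
    order, seen = [], set()
    for root in nodes:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(succ[root]))]
        while stack:
            node, it = stack[-1]
            for nxt in it:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(succ[nxt])))
                    break
            else:  # all successors explored: the node is finished
                order.append(node)
                stack.pop()

    # Pass 2: DFS on the reversed graph, in decreasing finishing time.
    best, assigned = set(), set()
    for root in reversed(order):
        if root in assigned:
            continue
        component, stack = {root}, [root]
        assigned.add(root)
        while stack:
            node = stack.pop()
            for prv in pred[node]:
                if prv not in assigned:
                    assigned.add(prv)
                    component.add(prv)
                    stack.append(prv)
        if len(component) > len(best):
            best = component
    return best

# Toy example: "bot" mentions "a" but is never mentioned back.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("bot", "a")]
scc = largest_scc(edges)
```

In this toy graph the node labeled "bot" and the node "d" are dropped, since no directed path leads back to or from them, which mirrors the rationale given above for restricting the analysis to the SCC.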

Table 1 Main properties of aggregated PC network and its largest strongly connected component (SCC): number of users N, with overall positive (negative) values of political position \(N_{+}\) (\(N_{-}\)), total number of interactions W, and average out-degree \(\langle k _{\textrm{out}} \rangle\). See Table S8 in SI for the PC network obtained from a different hashtag classification

Tweets can carry different political leanings, which can be characterized by the hashtags used. We assign to each tweet t a leaning \(l_{t} \in \{-1, 0, +1\}\), corresponding to a pro-impeachment, neutral, or anti-impeachment leaning, respectively, where a neutral leaning means that the hashtag can be used in either of the other two contexts. For a given user i, who has sent a number \(a_{i}\) of tweets (defined as his/her activity), we can associate a time-ordered set of leanings \(\mathcal{L}_{i} = \{ l_{1}, l_{2}, \ldots, l_{a_{i}-1}, l_{a_{i}} \}\), and define his/her average leaning, or political position \(P_{i}\), as

$$ P_{i} \equiv\frac{\sum_{t=1}^{a_{i}} l_{t}}{a_{i}}, $$
(1)

which is bounded in the interval \([-1, +1]\). This definition makes it possible to characterize a user’s political position as a continuous variable and to discern different degrees of orientation, in contrast to the more common binary measures. Since this definition crucially depends on the hashtag classification, we checked the robustness of our results by also reconstructing a PC network based on a different classification of neutral hashtags. See Methods and SI for details.
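As a minimal illustration of Eq. (1), the political position of a single user can be computed from his/her list of tweet leanings. The function name `political_position` and the example leanings are hypothetical; neutral tweets are assumed here to count toward the activity \(a_{i}\), as the definition above suggests.

```python
def political_position(leanings):
    """Average leaning P_i of one user, Eq. (1); each leaning is -1, 0 or +1."""
    if not leanings:
        raise ValueError("user has no classified tweets")
    return sum(leanings) / len(leanings)

# Made-up user with activity a = 5: three pro-impeachment tweets (-1),
# one neutral (0) and one anti-impeachment (+1).
example = [-1, -1, 0, -1, +1]
P = political_position(example)  # (-1 - 1 + 0 - 1 + 1) / 5 = -0.4
```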

In Fig. 1(a) we plot the distribution of the political position of users, showing that they are clearly split into two groups with opposite orientations, while a few users show neutral position (\(P \sim0\)). Interestingly, this distribution is strongly asymmetric with respect to \(P=0\): for \(P >0\) the great majority of users have extreme position \(P \simeq+1\), while for \(P<0\) more users have milder values of P. The numbers of users with overall positive (\(N_{+}\)) and negative (\(N_{-}\)) values of political leaning are, however, similar, see Table 1. The average leaning of a user is inherently correlated with his/her activity. In a scenario in which users send tweets of opposite leanings with the same probability, the political leaning variable would follow a binomial distribution, and the expected average leaning would decrease with activity. Figure 1(b) shows that the correlation between average leaning and activity is far from being driven by a random process: more active users are also more extreme. Interestingly, pro-impeachment users with the largest average activity have \(P \sim-0.75\), while the activity of anti-impeachment users is almost constant for \(0 < P < 0.75\), and reaches a maximum for \(P\sim1\).
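The random scenario mentioned above can be made explicit with a short calculation. As a simplifying assumption made here for illustration, suppose that each of the a tweets of a user independently carries a leaning \(+1\) or \(-1\) with probability \(1/2\), neglecting neutral tweets. Then the sum of leanings is a shifted binomial variable, and

$$ \langle P \rangle= 0, \qquad \operatorname{Var}(P) = \frac{1}{a^{2}} \sum_{t=1}^{a} \operatorname{Var}(l_{t}) = \frac{1}{a}, \qquad \bigl\langle \vert P \vert \bigr\rangle \simeq\sqrt{\frac{2}{\pi a}} \quad(a \gg1). $$

Under this null model the typical magnitude of the political position would thus decay as \(a^{-1/2}\), so that very active users would appear nearly neutral, which is the opposite of the trend reported in Fig. 1(b).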

Figure 1

(a) Number of users as a function of the political position P. (b) Average activity as a function of P. Only users with activity \(a \ge10\) in the SCC are considered for (a) and (b). (c) Visualization of the time-aggregated representation of the PC network, formed by \(N = 31{,}412\) users in the SCC. The size of nodes increases (non-linearly) with their degree. Colors represent political position, as defined by (1), blue for pro-, red for anti-impeachment, and white for neutral average leaning. (d) Community size and average political position of different communities identified by the Louvain algorithm

Figure 1(c) shows a visualization of the time-aggregated PC network, in which users are color-coded according to their average leaning. Two communities with opposite leanings are clearly visible in the network, while users with neutral position are found more frequently bridging the two groups. One can quantify this observation by identifying the community structure [32] as obtained by means of the Louvain algorithm [33]. In Fig. 1(d) we plot the average leaning and size of the different communities found. We show that the PC network is characterized by two large communities, both with approximately \(10^{4}\) users and opposite leaning with similar absolute values, \(P_{+}\approx0.82\) and \(P_{-}\approx-0.70\). However, pro-impeachment users also form other communities of relevant sizes with more moderate average leaning, while users with strong anti-impeachment leanings essentially belong to a single community. See SI for more details.

3 Topological evidence of echo chambers

One can quantify the presence of echo chambers by relating the political position of a user with the leaning of the tweets he/she receives, as well as with the leaning of his/her neighbors. In politics, echo chambers are characterized by users sharing similar opinions and exchanging messages with similar political views [13]. This translates, at the topological level, into a node i with a given political position \(P_{i}\) connected with nodes with a position close to \(P_{i}\), and receiving with higher probability messages with similar average leaning \(P_{i}\). In order to quantify these insights, we define, for each user i, the average position of incoming tweets, \(P_{i}^{\textrm{IN}}\), by applying (1) to the set of tweets from any user \(j \neq i\) mentioning user i. Analogously, the average position of the nearest neighbors, or successors, of user i, \(P_{i}^{\textrm{NN}}\), can be defined as \(P_{i}^{\textrm{NN}} \equiv\sum_{j} A_{ij} P_{j} / k_{\mathrm{out}, i}\), where \(A_{ij}\) is the adjacency matrix of the integrated PC network, \(A_{ij}=1\) if there is a link from node i to node j, \(A_{ij}=0\) otherwise, and \(k_{\mathrm{out}, i} = \sum_{j}A_{ij}\) is the out-degree of node i.
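The two quantities just defined can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the names `neighbor_position` and `incoming_position`, the dictionary-based data layout, and the toy values are all hypothetical, and users without successors (or without received tweets) are simply left out of the result.

```python
def neighbor_position(out_links, P):
    """P_i^NN: unweighted mean position of the successors of each node."""
    return {i: sum(P[j] for j in succ) / len(succ)
            for i, succ in out_links.items() if succ}

def incoming_position(tweets):
    """P_i^IN: Eq. (1) applied to the tweets (sender, receiver, leaning) received by i."""
    total, count = {}, {}
    for sender, receiver, leaning in tweets:
        if sender == receiver:
            continue  # only tweets from users j != i are counted
        total[receiver] = total.get(receiver, 0) + leaning
        count[receiver] = count.get(receiver, 0) + 1
    return {i: total[i] / count[i] for i in total}

# Toy, made-up data
P = {"a": -1.0, "b": -0.5, "c": 1.0}
out_links = {"a": {"b", "c"}, "b": {"a"}, "c": set()}
tweets = [("b", "a", -1), ("c", "a", +1), ("a", "b", -1), ("a", "b", -1)]

P_nn = neighbor_position(out_links, P)  # a: (-0.5 + 1.0) / 2 = 0.25, b: -1.0
P_in = incoming_position(tweets)        # a: (-1 + 1) / 2 = 0.0, b: -1.0
```

Note that \(P_{i}^{\textrm{NN}}\) averages over distinct neighbors through the adjacency matrix, whereas \(P_{i}^{\textrm{IN}}\) averages over individual tweets, so a neighbor who sends many messages weighs more in the latter.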

Figure 2 shows the correlation between the political position of a user i and (a) the position of his/her nearest neighbors, \(P_{i}^{\textrm{NN}}\), and (b) the average leaning of received tweets, \(P_{i}^{\textrm{IN}}\). Both plots are color-coded contour maps, representing the number of users in the phase space \((P,P^{\textrm{NN}})\) or \((P,P^{\textrm{IN}})\): the lighter the area in the map, the larger the density of users in that area. Figure 2 shows a strong correlation between the position of a user and the average position of both his/her nearest neighbors and the received tweets. Similar results are found for Fig. 2(a) when considering predecessors as nearest neighbors, see Fig. S11 in the SI. The Pearson correlation coefficient is \(r=0.89\) for \((P,P^{\textrm{NN}})\) and \(r=0.80\) for \((P,P^{\textrm{IN}})\), both statistically significant with a p-value \(p < 10^{-6}\). These topological properties of the PC network confirm the presence of echo chambers: users expressing both pro- and anti-impeachment leaning are more likely to send/receive messages to/from users that share their political opinion.

Figure 2

Contour maps for the (a) average political position of the nearest-neighbor \(P^{\text{NN}}\) and (b) average leaning of tweets received, \(P^{\text{IN}}\) against the average political position of a user P. Colors represent the density of users: the lighter the larger the number of users. Probability distribution of P, \(P^{\text{NN}}\), and \(P^{\text{IN}}\) are plotted in the axes. Only users with activity \(a\ge10\) (corresponding to 14813 users) are considered

Figure 2, however, also reveals that the densities in both plots are not symmetric between anti- and pro-impeachment positions: for \(P > 0\), most users are concentrated in a small region of the \((P,P^{\textrm{NN}})\) and \((P,P^{\textrm{IN}})\) spaces, while for \(P<0\), they spread on a larger area. This means that users with extreme position \(P \simeq1\) are more likely to interact only with users that share the same extreme leaning, while users with \(P<0\) exchange information (send and receive tweets) also with peers that do not share their political opinion. These observations are also in consonance with the characterization of the community structure, as shown in Fig. 1(d), in which users with strong anti-impeachment leanings form a single, large community, and users with pro-impeachment leanings form several more heterogeneous communities.

The differences in the topological structure of the two communities can be related to the political context under study: while users characterized by anti-impeachment leanings refer to a more homogeneous political area (Partido dos Trabalhadores and small left-wing parties), pro-impeachment users share different political views, including center and right-wing positions, and show different levels of support for the impeachment. Another possible and important source of asymmetry is the constant release of content from other sources, in particular from the traditional media, broadcasting mostly content that stimulates pro-impeachment leanings, possibly reinforcing their dissemination to a more diversified audience.

4 Effects of political position on information spreading

The presence of echo chambers implies that users mainly exchange messages with other users sharing similar leanings. This fact can have an impact on the way in which information is transmitted through the whole PC network. A possible empirical way to gauge the effects of echo chambers on information spreading is to consider the number of RTs that a given user can achieve [23,24,25]. One can expect that more influential users, producing content that attracts more interest, will be rewarded by a larger number of RTs. Figure S12 (see SI) shows the number of RTs of users, as a function of both their activity and position. One can see that the number of times that a user is retweeted is strongly correlated with the activity of that user. With hindsight, this observation is to be expected, since a user who produces many tweets has a larger chance of being retweeted, under the homogeneous assumption of equal probability of retweeting. However, if we consider the number of RTs normalized by the total number of tweets sent, we observe a lack of evident correlations with users’ political position, as shown in Fig. S12(b) in SI.

In order to better understand the role of the network’s polarization in information propagation, we followed a different approach, by considering simple models of spreading dynamics. We have focused on the susceptible-infected-susceptible (SIS) and susceptible-infected-recovered (SIR) models [34], classical epidemic processes which have also been used to study the diffusion of information [35, 36]. In the SIS model, each agent can be in either of two states, susceptible or infectious. In the SIR model, an agent can also be in a recovered state in which it can neither be infected nor transmit the disease. Susceptible agents may become infectious upon contact with infected neighbors, with a certain transmission rate λ in both processes. Infectious agents can spontaneously heal with rate \(\tau^{-1}\), becoming susceptible again or recovered in SIS and SIR, respectively. Within an information diffusion framework, a susceptible node represents a user who is unaware of the circulating information (e.g. rumors, news, an ongoing street protest), while an infectious user is aware of it and can spread it further to his/her contacts. A recovered agent is aware but not willing to transmit the information.

We ran the SIS and SIR dynamics on the temporal PC network, using the real timing of connections between users as given by the time stamps of interactions, so as to ensure that the information diffusion follows time-respecting paths. In temporal networks characterized by an instantaneous duration of contacts, the infection process can be implemented by considering λ as a transmission probability, i.e. whenever a susceptible node i gets in contact with an infectious node j, node i will become infected with probability λ. The healing occurs spontaneously after a fixed time τ with respect to the moment of infection. We start the dynamics with only one node i infected, and stop it on the last interaction of the temporal sequence. The set of nodes that were infected at least once along the dynamics, started with i as source of infection, forms the set of influence of node i, \(\mathcal{I}_{i}\) [37]. The set of influence of a user thus represents the set of individuals that can be reached by a message sent by him/her, depending on the transmission probability λ and healing time τ.
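A single run of these dynamics can be sketched in Python as follows. This is a hedged illustration rather than the simulation code behind the results: the function name `influence_set`, the event layout `(t, sender, receiver)`, the assumption that information flows from the sender of a mention to its receiver, and the choice to infect the seed at the time of the first event are all assumptions made here.

```python
import random

def influence_set(events, seed, lam, tau, model="SIS", rng=None):
    """One run of SIS/SIR over time-ordered contacts (t, sender, receiver).

    Returns the set of nodes infected at least once, seed included.
    """
    rng = rng or random.Random()
    events = sorted(events, key=lambda e: e[0])
    if not events:
        return {seed}
    infected_at = {seed: events[0][0]}  # node -> time of its current infection
    recovered = set()                   # only populated in the SIR model
    reached = {seed}

    def is_infectious(node, t):
        t0 = infected_at.get(node)
        if t0 is None:
            return False
        if t - t0 >= tau:               # fixed healing time has elapsed
            del infected_at[node]
            if model == "SIR":
                recovered.add(node)
            return False
        return True

    for t, sender, receiver in events:
        if not is_infectious(sender, t):
            continue
        if is_infectious(receiver, t) or receiver in recovered:
            continue
        if rng.random() < lam:          # transmission with probability lam
            infected_at[receiver] = t
            reached.add(receiver)
    return reached

# Toy, made-up contact sequence
events = [(0, "a", "b"), (1, "b", "c"), (2, "d", "a"), (5, "c", "d")]
short_memory = influence_set(events, "a", lam=1.0, tau=3)   # c heals before t = 5
long_memory = influence_set(events, "a", lam=1.0, tau=10)   # c still infectious at t = 5
from_d = influence_set(events, "d", lam=1.0, tau=10)        # a is reached too late to reach b
```

The last call shows the role of time-respecting paths: starting from "d", user "a" becomes aware at t = 2, after his/her only message to "b" has already been sent, so "b" is never reached even though a static path from "d" to "b" exists.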

For different values of λ and τ, we measure the spreading capacity \(S_{i}\) of each user i, defined as the relative size of his/her set of influence, namely

$$ S_{i}(\lambda, \tau) \equiv \frac{ \vert \mathcal{I}_{i}(\lambda, \tau) \vert }{N}, $$
(2)

by running SIS or SIR dynamics with node i as seed of the infection, averaged over several runs. In Fig. 3 we plot the average spreading capacity \(\langle S \rangle\) of users as a function of their political position P and activity a for the SIS model. As expected, the more active the users, the larger their spreading capacity (darker colors of the plots). However, one can see that \(\langle S \rangle\) is not constant with respect to the users’ political orientation: the spreading capacity is clearly smaller for users with anti-impeachment leaning, while it is larger for users with \(P<0\), reaching a maximum for \(P \sim-0.5\). Different values of λ and τ for the SIS and SIR models (available in the SI) show similar behavior.

Figure 3

Heat map of the average spreading capacity \(\langle S \rangle\) of users, as a function of their political position P and activity a. The transmission probability of the SIS dynamics is \(\lambda= 0.5\) and \(\tau= 7\) days. Averages were performed over 100 runs

In order to disentangle the effect of the political position of users on spreading capacity from their activity, in Fig. 4 we plot the average spreading capacity of users as a function of their position, \(\langle S(P) \rangle\), for \(\lambda=0.2\) and \(\tau=7\) days. Other values are shown in the SI. Only users with activity bounded in the interval \(a \in[10,100]\) are considered, so as to ensure that the average activity of these users is relatively homogeneous with respect to their political position (as shown in the SI). Figure 4 shows that the average spreading capacity reaches a maximum for users with intermediate pro-impeachment position, \(P \simeq-0.5\), maximum that is up to four times larger than the value for anti-impeachment users. This striking difference is robust with respect to the values of the transmission probability λ and healing time τ. As shown in the SI, the shapes of the \(\langle S(P) \rangle\) curves are remarkably similar, even though significantly different values are reached. Analogous behavior is observed for the SIR model (see SI).

Figure 4

Average spreading capacity \(\langle S (P) \rangle \) (black curve, left axes) of users with political position P. Average diversity \(\langle\sigma(P) \rangle\) (red curve, right axes) and average political position \(\langle\mu(P) \rangle\) (bars, top panel) of the set of influence reached by users with position P. Transmission probability \(\lambda= 0.20\) and \(\tau= 7\) days. Only the 11,386 users with activity \(a \in[10,100]\) are considered. Different ranges of a and values of λ are shown in the SI. Results are averaged over 100 runs, error bars represent standard errors.

5 Diversity increases spreading capacity

The origin of the large spreading capacity of users with pro-impeachment position cannot be traced back to their numeric prevalence in the network, since users are split into two groups of similar size; see Table S8 in SI. Moreover, the great majority of users are characterized by extreme position, \(|P| \simeq1\), yet they show a much smaller spreading capacity than users with intermediate pro-impeachment position, \(P \simeq-0.5\). One way to understand this difference is to look at the characteristics of the users reached by the spreading dynamics. One can analyze the political position of the set of influence \(\mathcal{I}_{i}\), by defining, for each user i, the average \(\mu_{i}\) and the variance \(\sigma_{i}\) of the political positions expressed by \(\mathcal{I}_{i}\), as

$$ \mu_{i}\equiv\sum_{j \in\mathcal{I}_{i}} \frac{ P_{j}}{ \vert \mathcal{I} _{i} \vert }, \qquad \sigma_{i} \equiv\sum _{j \in\mathcal{I}_{i}} \frac{(P_{j} - \mu_{i})^{2}}{ \vert \mathcal{I}_{i} \vert }. $$
(3)

The average \(\mu_{i}\) represents the average leaning of the users reached by i, while the variance \(\sigma_{i}\) represents how heterogeneously oriented \(\mathcal{I}_{i}\) is. A small variance \(\sigma_{i}\) indicates that the political position of \(\mathcal{I}_{i}\) is quite uniform and close to its average value, while a large value of \(\sigma_{i}\) shows that \(\mathcal{I}_{i}\) has heterogeneous political positions. Therefore, the variance \(\sigma_{i}\) quantifies the diversity of the users reached by i.
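Given one set of influence, the quantities in Eqs. (2) and (3) follow directly. The sketch below is only illustrative: the function name `audience_stats` and the toy positions are hypothetical, and the set of influence is assumed to be non-empty.

```python
def audience_stats(influence, P, n_users):
    """Spreading capacity S (Eq. 2) and mean mu, variance sigma (Eq. 3)."""
    size = len(influence)
    S = size / n_users
    mu = sum(P[j] for j in influence) / size
    sigma = sum((P[j] - mu) ** 2 for j in influence) / size
    return S, mu, sigma

# Toy, made-up positions for a network of N = 4 users
P = {"a": -1.0, "b": -0.5, "c": 0.5, "d": 1.0}
S, mu, sigma = audience_stats({"a", "b", "c"}, P, n_users=4)
# S = 3/4, mu = -1/3, sigma = 7/18
```

For a stochastic spreading process, these values would be computed for each run and then averaged over runs, and subsequently over users with similar position P, to obtain curves such as \(\langle S(P) \rangle\) and \(\langle\sigma(P) \rangle\).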

In Fig. 4 (top panel) we plot the average political position \(\langle\mu(P) \rangle\) of the set of influence reached by users with position P, showing that users with pro-impeachment (neutral, anti-impeachment) leanings are more likely to reach, on average, users sharing the same pro-impeachment (neutral, anti-impeachment) leanings. This result (robust across different values of λ and τ, as shown in the SI) indicates that, given the strongly polarized structure of the network, information diffusion is biased toward individuals that share the same political opinion, quantifying the effect of echo chambers. The average \(\langle\mu(P) \rangle\), indeed, gauges the strength of the echo chambers: the closer \(\langle\mu(P) \rangle\) is to P, the stronger the echo chamber effect. Furthermore, one can note differences between pro- and anti-impeachment users: μ is almost constant for negative values of P, so echo chamber effects are weak, while μ grows almost linearly for positive P, indicating stronger echo chamber effects.

Even more interestingly, Fig. 4 shows that the diversity \(\sigma_{i}\) of the users reached by i strongly depends on his/her political position \(P_{i}\). The curve of the average diversity as a function of the political position, \(\langle\sigma(P) \rangle\), follows a behavior remarkably similar to the average spreading capacity of users with position P, \(\langle S(P) \rangle\). The close correlation observed between \(\langle\sigma(P) \rangle\) and \(\langle S(P) \rangle\) indicates that if a user is able to reach a diverse audience, formed by users that do not share his/her average leaning, then the size of his/her set of influence is much larger. That is, individuals with large spreading capacity are able to break their echo chambers. Note that this result is not trivial since the sizes of the echo chambers are much bigger than the number of users reached. Moreover, the value of \(\langle\sigma(P) \rangle\) is statistically significant and does not depend on the number of users considered in the average. For instance, there are many more users with extreme orientations (\(|P| \simeq1\)) than users with intermediate position (\(P \simeq-0.5\)), yet it holds that \(\langle\sigma(P \simeq-0.5 ) \rangle\gg\langle\sigma(|P| \simeq1) \rangle\). Furthermore, given the larger number of users considered, error bars for \(\langle\sigma(|P| \simeq1) \rangle\) are smaller than the ones for \(\langle\sigma(P \simeq-0.5 ) \rangle\).

6 Discussion

The effects of echo chambers on the openness of online political debate have been debated by the scientific community. Recently, it has been shown that echo chambers are expected to enhance the spreading of information in synthetic networks [38]. Their impact in real communication networks, however, remains poorly understood. The main contribution of this paper is twofold: (i) we quantify the presence of echo chambers in the Twitter discussion about the impeachment of the former Brazilian President Dilma Rousseff, showing that communities of users expressing opposite political positions emerge in the topological structure of the communication network, and (ii) we gauge the effects of such echo chambers on information spreading, showing that the capability of users to spread the content they produce depends on their political leaning. The use of spreading models allows us to characterize the internal structure of echo chambers, showing that users belonging to the same echo chamber, with different convictions (i.e., the intensity of their leaning on the impeachment issue), can have quite different spreading capabilities.

Our method to quantify echo chambers is built upon two main ingredients: (i) we reconstruct a communication network based on mentions, in which people can actually discuss and exchange information related to ongoing political events, and (ii) we define a continuous political leaning measure, by classifying the hashtags used in tweets as expressing a leaning in favor of or against the impeachment, which is independent of the network’s reconstruction. We then observe that anti- and pro-impeachment users clearly separate into different communities in the PC network. It is important to remark that, while it is well known that networks formed by RTs can have a strong partisan structure, since RTs generally imply endorsement, this observation is new for mention networks, in which users characterized by opposite leanings can easily interact [11].

These two clusters of users sharing similar opinions, or echo chambers, can be characterized by looking at the correlations between the in-flow and out-flow of leanings, as well as between the average leanings of an individual and his/her nearest neighbors. The topologies of the two echo chambers are not exactly equivalent. Users expressing anti-impeachment leanings tend to lean towards the extreme, achieving a position \(P \simeq+1\), while users with pro-impeachment leanings show milder tendencies, reflected in the presence of medium-sized communities with overall negative political position.

We have gauged the effects of echo chambers on information diffusion by running simple models of information spreading, observing that, on average, users are more likely to receive information from peers sharing the same average leanings. We then see that people with predominantly pro-impeachment leanings are able to broadcast their message to a potentially larger audience than other users. Furthermore, such audiences are characterized by a greater diversity of opinions, indicating that pro-impeachment leanings can spread to both pro- and anti-impeachment users, a signature that echo chambers can be broken. At the same time, our method is able to identify those users trapped in the echo chamber, who can be reached only by information shared by peers with similar political leaning, a feature that could be exploited to mitigate echo chamber effects in online social media.

An interesting question arising here is what makes pro-impeachment users better spreaders than users with an opposite view. Recent works [39, 40] have related the spreading efficiency of users to their topological position in the integrated network, in particular to the degree and centrality of individuals as measured by the k-core index [41]. In Fig. S17 of the SI we show that the average position P of users is largely uncorrelated with both their average degree k and k-core index, indicating that users characterized by pro-impeachment leaning cannot be singled out by such simple topological features. However, topological differences between the two echo chambers can be seen in Fig. 1(d). While anti-impeachment users are clustered in a single, large community with strong average anti-impeachment leaning, pro-impeachment users are distributed over several smaller communities characterized by less extreme leaning.

It is important to highlight that our method for quantifying echo-chamber effects by using epidemic processes comes with some limitations. A first issue is that only very large communication networks can be analyzed, because the extraction of the strongly connected component greatly reduces the number of nodes. However, this procedure is essential to properly address the communication dynamics between users, and possibly avoid the presence of social bots. Furthermore, our definition of political position relies entirely on the hand-tagged hashtag classification. It is well known that hashtags can be hijacked [42], i.e. they can be used by some users with a different (or opposite) purpose than the one originally intended, thus invalidating the leaning inferred through them. However, our analysis is based on a large number of hashtags, and it is robust with respect to a significant change of the leaning classification; see results for the additional classification in the SI.

Future research on this topic should address three main points. Firstly, more sophisticated methods for detecting the political position of users, such as automatic sentiment analysis of tweet contents, could be considered. These methods are, however, not exempt from limitations [43, 44]. Secondly, more realistic models of information diffusion, such as complex contagions, independent cascade and linear threshold models [45,46,47,48], could be used to estimate individuals’ spreading capacity. We have checked numerically that a modification of the classic Watts threshold model for complex contagion [45] to the framework of temporal networks [49] leads to the same behavior observed in the SIR and SIS models, see Fig. S18 of the SI. Therefore, while we do not expect our results to qualitatively depend on the details of the propagation dynamics considered, interesting features may be added, such as a transmission probability that depends on the similarity between opinions. It would also be interesting to measure the evolution of users’ political position in time, which is not expected to be constant over the whole temporal sequence. Finally, given that our conclusions are based on a single case study, it would be interesting to replicate our method on different data sets, to identify in a quantitative way the presence of echo chambers across differently polarized political contexts, over different social media.

7 Methods

Here we describe the empirical data used in the paper, available upon motivated request to the authors, and how we reconstruct the network from it, as well as the results of the hashtags classification. For further details, see SI.

7.1 Reconstruction of the PC networks

Our data set is composed of tweets collected daily from the public streaming of the Twitter API by specifying a list of 323 keywords (see Table S2 of the SI) related to the impeachment process of the former president of Brazil, Dilma Rousseff. Data have been gathered between March 5th and December 31st of 2016. Only tweets including mentions of other users and at least one of the classified hashtags (see next Section) have been selected, while retweets have been discarded. Tweets containing hashtags of opposite leanings (\(l_{t}=+1\) and \(l_{t}=-1\)) account for less than 1% of the total and have been discarded. The timing of the interactions has been preserved, so that in the temporal PC network a directed link from node i to node j at time t is drawn if user i sends a tweet mentioning j at time t. Finally, the strongly connected component (SCC) of the time-aggregated version of the PC network has been extracted.
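The reconstruction steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the tweet record layout (`time`, `user`, `mentions`, `hashtags`, `is_retweet`), the placeholder hashtags, and the function names are all hypothetical, and the strongly connected component is found with a simple forward and backward reachability search that is adequate only for small examples.

```python
from collections import defaultdict

def temporal_links(tweets, hashtag_leaning):
    """Build (t, i, j) links: user i mentions user j at time t."""
    links = []
    for tw in tweets:
        if tw["is_retweet"] or not tw["mentions"]:
            continue  # retweets and tweets without mentions are dropped
        leanings = {hashtag_leaning[h] for h in tw["hashtags"]
                    if h in hashtag_leaning}
        if not leanings or {+1, -1} <= leanings:
            continue  # no classified hashtag, or opposite leanings mixed
        for j in tw["mentions"]:
            links.append((tw["time"], tw["user"], j))
    return links

def reachable(adj, start):
    """Nodes reachable from `start` following the adjacency map `adj`."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def largest_scc(edges):
    """Largest strongly connected component of a directed edge set."""
    fwd, bwd, nodes = defaultdict(set), defaultdict(set), set()
    for i, j in edges:
        fwd[i].add(j)
        bwd[j].add(i)
        nodes.update((i, j))
    best, remaining = set(), set(nodes)
    while remaining:
        v = next(iter(remaining))
        comp = reachable(fwd, v) & reachable(bwd, v)  # SCC containing v
        remaining -= comp
        if len(comp) > len(best):
            best = comp
    return best

# Toy data with placeholder hashtags (+1 anti-, -1 pro-impeachment).
hashtag_leaning = {"#anti": +1, "#pro": -1, "#neutral": 0}
tweets = [
    {"time": 1, "user": "a", "mentions": ["b"], "hashtags": ["#anti"], "is_retweet": False},
    {"time": 2, "user": "b", "mentions": ["a"], "hashtags": ["#anti"], "is_retweet": False},
    {"time": 3, "user": "b", "mentions": ["c"], "hashtags": ["#pro"], "is_retweet": False},
    {"time": 4, "user": "c", "mentions": ["a"], "hashtags": ["#pro", "#anti"], "is_retweet": False},
    {"time": 5, "user": "d", "mentions": ["a"], "hashtags": ["#anti"], "is_retweet": True},
    {"time": 6, "user": "a", "mentions": [], "hashtags": ["#anti"], "is_retweet": False},
]
links = temporal_links(tweets, hashtag_leaning)
scc = largest_scc({(i, j) for _t, i, j in links})
pc = [(t, i, j) for t, i, j in links if i in scc and j in scc]
print(pc)
```

Only the mutual exchange between a and b survives in this toy example, which illustrates why the extraction of the SCC strongly reduces the size of the network.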

7.2 Hashtag classification

A list of the 495 most tweeted hashtags from the collected data has been classified through a manual annotation of the leanings (anti-impeachment, pro-impeachment, neutral, or not related to the issue) by four independent volunteers. Through an interactive webpage, the volunteers had the opportunity to browse Twitter to check tweets containing the selected hashtag within the time window of interest. The final classification of each hashtag has been determined by the majority (at least 3 of 4) of the opinions of the volunteers. Full agreement was reached for 321 (64.8%) hashtags, while for 443 (89.5%) of them at least 3 of 4 volunteers agreed. A majority agreement has not been reached for 52 (10.5%) hashtags, which have been excluded from the data set. Discrepancies between any pair of volunteers were less than 10%. A final list of 404 hashtags (see Tables S3 to S6 in the SI for the final classification) has been used to reconstruct the PC network. The numbers of interactions in the SCC with at least one of the classified hashtags were \(1{,}126{,}150\) for pro-impeachment, \(144{,}405\) for neutral, and \(756{,}498\) for anti-impeachment leanings, showing a slight tendency for pro-impeachment hashtags. Fig. S4 in the SI shows the usage count of the 100 most popular hashtags.
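The majority rule described above can be sketched in a few lines of Python. The label strings, the function name and the example votes are hypothetical; the sketch only illustrates the rule that a hashtag is kept when at least 3 of the 4 volunteers agree, and checks that the reported percentages follow from the reported counts.

```python
from collections import Counter

def classify(votes, min_agreement=3):
    """Return the majority label, or None if fewer than
    `min_agreement` annotators agree on a single label."""
    label, n = Counter(votes).most_common(1)[0]
    return label if n >= min_agreement else None

# Hypothetical annotations from four volunteers.
print(classify(["pro", "pro", "pro", "anti"]))       # majority reached
print(classify(["pro", "pro", "anti", "neutral"]))   # no majority: excluded

# Percentages quoted in the text, recomputed from the counts.
total = 495
shares = {n: round(100 * n / total, 1) for n in (321, 443, 52)}
print(shares)
```

Note that 443 hashtags with a majority plus 52 without one add up to the 495 hashtags that were annotated.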

References

  1. Masum H, Newmark C, Tovey M (2011) The reputation society: how online opinions are reshaping the offline world. Information society series. MIT Press, Cambridge

  2. Wasserman S, Faust K (1994) Social network analysis: methods and applications. Cambridge University Press, Cambridge

  3. Lazer D, Pentland A, Adamic L, Aral S, Barabási A-L, Brewer D, Christakis N, Contractor N, Fowler J, Gutmann M, Jebara T, King G, Macy M, Roy D, Van Alstyne M (2009) Computational social science. Science 323(5915):721–723

  4. Lee JK, Choi J, Kim C, Kim Y (2014) Social media, network heterogeneity, and opinion polarization. J Commun 64(4):702–722

  5. Bond RM, Fariss CJ, Jones JJ, Kramer ADI, Marlow C, Settle JE, Fowler JH (2012) A 61-million-person experiment in social influence and political mobilization. Nature 489:295–298

  6. Aral S, Muchnik L, Sundararajan A (2009) Distinguishing influence-based contagion from homophily-driven diffusion in dynamic networks. Proc Natl Acad Sci 106(51):21544–21549

  7. Hanna A, Wells C, Maurer P, Friedland L, Shah D, Matthes J (2013) Partisan alignments and political polarization online: a computational approach to understanding the French and US presidential elections. In: Proceedings of the 2nd workshop on politics, elections and data, PLEAD ’13. ACM, New York, pp 15–22

  8. Weber I, Garimella VRK, Batayneh A (2013) Secular vs. islamist polarization in Egypt on Twitter. In: Proceedings of the 2013 IEEE/ACM international conference on advances in social networks analysis and mining, ASONAM ’13. ACM, New York, pp 290–297

  9. Borge-Holthoefer J, Magdy W, Darwish K, Weber I (2015) Content and network dynamics behind Egyptian political polarization on Twitter. In: Proceedings of the 18th ACM conference on computer supported cooperative work & social computing, CSCW 2015, Vancouver, BC, Canada, March 14–18, 2015, pp 700–711

  10. González-Bailón S, Borge-Holthoefer J, Rivero A, Moreno Y (2011) The dynamics of protest recruitment through an online network. Sci Rep 1:197

  11. Conover M, Ratkiewicz J, Francisco M, Gonçalves B, Flammini A, Menczer F (2011) Political polarization on Twitter. In: Proc. 5th international AAAI conference on weblogs and social media (ICWSM)

  12. Conover MD, Gonçalves B, Flammini A, Menczer F (2012) Partisan asymmetries in online political activity. EPJ Data Sci 1:6

  13. Garrett RK (2009) Echo chambers online?: politically motivated selective exposure among Internet news users. J Comput-Mediat Commun 14(2):265–285

  14. Garimella K, De Francisci Morales G, Gionis A, Mathioudakis M (2018) Political discourse on social media: echo chambers, gatekeepers, and the price of bipartisanship. In: Proceedings of the 2018 World Wide Web Conference, WWW ’18, Republic and Canton of Geneva, Switzerland. International World Wide Web Conferences steering committee, pp 913–922

  15. Bastos MT, Mercea D, Baronchelli A (2018) The geographic embedding of online echo chambers: evidence from the Brexit campaign. PLoS ONE 13(11):e0206841

  16. Del Vicario M, Bessi A, Zollo F, Petroni F, Scala A, Caldarelli G, Stanley HE, Quattrociocchi W (2016) The spreading of misinformation online. Proc Natl Acad Sci 113(3):554–559

  17. Vicario MD, Vivaldo G, Bessi A, Zollo F, Scala A, Caldarelli G, Quattrociocchi W (2016) Echo chambers: emotional contagion and group polarization on Facebook. Sci Rep 6:37825

  18. Wojcieszak M (2010) ‘Don’t talk to me’: effects of ideologically homogeneous online groups and politically dissimilar offline ties on extremism. New Media Soc 12(4):637–655

  19. Barberá P, Jost JT, Nagler J, Tucker JA, Bonneau R (2015) Tweeting from left to right: is online political communication more than an echo chamber? Psychol Sci 26(10):1531–1542

  20. Dubois E, Blank G (2018) The echo chamber is overstated: the moderating effect of political interest and diverse media. Inf Commun Soc 21(5):729–745

  21. Holme P (2015) Modern temporal network theory: a colloquium. Eur Phys J B 88(9):234

  22. Lambiotte R, Masuda N (2016) A guide to temporal networks. Series on complexity science, vol 4. World Scientific, Singapore

  23. Galuba W, Aberer K, Chakraborty D, Despotovic Z, Kellerer W (2010) Outtweeting the Twitterers: predicting information cascades in microblogs. In: Proceedings of the 3rd wonference on online social networks, WOSN’10. USENIX Association, Berkeley, pp 3–3

  24. Jenders M, Kasneci G, Naumann F (2013) Analyzing and predicting viral tweets. In: Proceedings of the 22nd international conference on World Wide Web, WWW ’13 companion. ACM, New York, pp 657–664

  25. Ratkiewicz J, Conover M, Meiss M, Gonçalves B, Patil S, Flammini A, Menczer F (2011) Detecting and tracking political abuse in social media. In: Proceedings of the 20th international conference companion on World wide web, WWW ’11. ACM, New York, p 249

  26. Grabowicz PA, Ramasco JJ, Moro E, Pujol JM, Eguiluz VM (2012) Social features of online networks: the strength of intermediary ties in online social media. PLoS ONE 7:1–9

  27. Holme P, Saramäki J (eds) (2013) Temporal networks. Springer, Berlin

  28. Newman MEJ (2010) Networks: an introduction. Oxford University Press, Oxford

  29. Ferrara E, Varol O, Davis C, Menczer F, Flammini A (2016) The rise of social bots. Commun ACM 59:96–104

  30. Stella M, Ferrara E, De Domenico M (2018) Bots increase exposure to negative and inflammatory content in online social systems. Proc Natl Acad Sci 115(49):12435–12440

  31. Shao C, Ciampaglia GL, Varol O, Yang K-C, Flammini A, Menczer F (2018) The spread of low-credibility content by social bots. Nat Commun 9(1):4787

  32. Fortunato S (2010) Community detection in graphs. Phys Rep 486(3–5):75–174

  33. Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008:P10008

  34. Anderson RM, May RM (1992) Infectious diseases in humans. Oxford University Press, Oxford

  35. Zhao L, Cui H, Qiu X, Wang X, Wang J (2013) SIR rumor spreading model in the new media age. Phys A, Stat Mech Appl 392(4):995–1003

  36. Granell C, Gómez S, Arenas A (2013) Dynamical interplay between awareness and epidemic spreading in multiplex networks. Phys Rev Lett 111:128701

  37. Holme P (2005) Network reachability of real-world contact sequences. Phys Rev E 71:46119

  38. Törnberg P (2018) Echo chambers and viral misinformation: modeling fake news as complex contagion. PLoS ONE 13:1–21

  39. Baños RA, Borge-Holthoefer J, Moreno Y (2013) The role of hidden influentials in the diffusion of online information cascades. EPJ Data Sci 2:6

  40. Alvarez R, Garcia D, Moreno Y, Schweitzer F (2015) Sentiment cascades in the 15m movement. EPJ Data Sci 4:6

  41. Seidman SB (1983) Network structure and minimum degree. Soc Netw 5:269–287

  42. Hadgu AT, Garimella K, Weber I (2013) Political hashtag hijacking in the U.S. In: Proceedings of the 22nd international conference on World Wide Web, WWW ’13 companion. ACM, New York, pp 55–56

  43. Gonçalves P, Araújo M, Benevenuto F, Cha M (2013) Comparing and combining sentiment analysis methods. In: Proceedings of the first ACM conference on online social networks, COSN ’13. ACM, New York, pp 27–38

  44. Conover M, Gonçalves B, Ratkiewicz J, Flammini A, Menczer F (2011) Predicting the political alignment of Twitter users. In: Proceedings of 3rd IEEE conference on social computing (SocialCom), pp 192–199

  45. Watts DJ (2002) A simple model of global cascades on random networks. Proc Natl Acad Sci 99(9):5766–5771

  46. Centola D, Macy M (2007) Complex contagions and the weakness of long ties. Am J Sociol 113(3):702–734

  47. Saito K, Nakano R, Kimura M (2008) Prediction of information diffusion probabilities for independent cascade model. In: Proceedings of the 12th international conference on knowledge-based intelligent information and engineering systems, part III, KES ’08. Springer, Berlin, pp 67–75

  48. Borodin A, Filmus Y, Oren J (2010) Threshold models for competitive influence in social networks. In: Proceedings of the 6th international conference on Internet and network economics, WINE’10. Springer, Berlin, pp 539–550

  49. Karimi F, Holme P (2013) Threshold model of cascades in empirical temporal networks. Phys A, Stat Mech Appl 392(16):3476–3483

Acknowledgements

We thank Gino Ceotto and Diogo H. Silva for useful discussion. This work was partially supported by the Brazilian agencies CNPq and FAPEMIG. Authors thank the support from the program Ciência sem Fronteiras—CAPES under project No. 88881.030375/2013-01. This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brasil (CAPES)—Finance Code 001. MS acknowledges financial support by the J. McDonnell Foundation. RP-S acknowledges financial support from the Spanish MINECO, under Project No. FIS2016-76830-C2-1-P, and additional financial support from ICREA Academia, funded by the Generalitat de Catalunya.

Author information

Contributions

WC, SCF, RP-S, and MS designed and performed research. WC collected data and performed simulations. WC, SCF, RP-S, and MS wrote the paper. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Michele Starnini.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic Supplementary Material

Below is the link to the electronic supplementary material.

Supplementary information. (PDF 3.1 MB)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Cota, W., Ferreira, S.C., Pastor-Satorras, R. et al. Quantifying echo chamber effects in information spreading over political communication networks. EPJ Data Sci. 8, 35 (2019). https://doi.org/10.1140/epjds/s13688-019-0213-9

  • DOI: https://doi.org/10.1140/epjds/s13688-019-0213-9

Keywords