News sharing on Twitter reveals emergent fragmentation of media agenda and persistent polarization

News sharing on social networks reveals how information disseminates among users. This process, constrained by user preferences and social ties, plays a key role in the formation of public opinion. In this work, we used bipartite news-user networks to study the news sharing behavior of main Argentinian media outlets on Twitter. Our objective was to understand the role of political polarization in the emergence of high-affinity groups with respect to news sharing. We compared results between years with and without presidential elections, and between groups of politically active and inactive users, the latter serving as a control group. The behavior of users resulted in well-differentiated communities of news articles identified by a unique distribution of media outlets. In particular, the structure of these communities revealed the dominant ideological polarization in Argentina. We also found that users formed two groups identified by their consumption of media outlets, which also displayed a bias towards the two main parties that dominate the political life in Argentina. Overall, our results consistently identified ideological polarization as a main driving force underlying Argentinian news sharing behavior on Twitter.
Supplementary Information: The online version contains supplementary material available at 10.1140/epjds/s13688-022-00360-8.


Keywords for the Twitter search of users
To select both sets of users (politically engaged users and those who mention media outlets), we used the keywords listed in Table 1. Users who tweeted using any of these words were selected.

Bipartite Degree for News and Users
The degree distributions of the bipartite networks for users and news are shown in Fig. S1. The long-tailed distributions imply that some users share news compulsively and that some news articles are shared by many users. A hyperbolic projection is therefore a natural choice to mitigate the effect of these highly connected nodes.
Figure S1: Degree of news and users in the bipartite networks. The user degree is the number of news articles shared by the user; the news degree is the number of users that shared a news article.
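As an illustration of how a hyperbolic projection damps highly connected nodes, the following minimal sketch projects a user-news edge list onto the news side. It assumes a Newman-style weighting, in which a user who shared k articles contributes 1/(k - 1) to each pair of articles they shared; the exact weighting used in the main work is not restated in this section, and the function name, input format and toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def hyperbolic_projection(edges):
    """Project a bipartite (user, news) edge list onto the news side.

    Assumption: each user who shared k >= 2 articles adds 1 / (k - 1)
    to the weight of every pair of articles they shared, so that
    compulsive sharers count less per pair.
    """
    shared = defaultdict(set)
    for user, news in edges:
        shared[user].add(news)

    weights = defaultdict(float)
    for articles in shared.values():
        k = len(articles)
        if k < 2:
            continue  # a single article creates no pair
        for a, b in combinations(sorted(articles), 2):
            weights[(a, b)] += 1.0 / (k - 1)
    return dict(weights)


# Toy data (hypothetical): u1 shared two articles, u2 shared three.
toy_edges = [
    ("u1", "n1"), ("u1", "n2"),
    ("u2", "n1"), ("u2", "n2"), ("u2", "n3"),
]
projection = hyperbolic_projection(toy_edges)
```

In this toy case the pair shared by both users receives a larger weight than the pairs shared only by the more active user, whose contribution per pair is halved.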
In Table 2, we summarize the topological properties of all analyzed networks. Although the number of users selected in both data sets was the same, differences between them appear after discarding tweets that do not contain media outlet URLs. Networks from the politically active data set are larger and less dense than those from the control group. On the one hand, the control group news networks have fewer edges (E) and a lower mean degree (⟨k⟩) than the politically active news networks. On the other hand, the control group users networks have more edges and a higher ⟨k⟩ than the politically active ones.
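For reference, the mean degree and density of a network follow directly from its node and edge counts. The sketch below assumes undirected networks without self-loops, which is the usual setting for projected networks but is an assumption here; the function names are hypothetical.

```python
def mean_degree(n_nodes, n_edges):
    """Mean degree of an undirected network: each edge adds two degree units."""
    return 2.0 * n_edges / n_nodes


def density(n_nodes, n_edges):
    """Fraction of the n(n-1)/2 possible undirected edges that are present."""
    if n_nodes < 2:
        return 0.0
    return 2.0 * n_edges / (n_nodes * (n_nodes - 1))
```

With these definitions, a network can have more nodes and edges than another and still be less dense, because the number of possible edges grows quadratically with the number of nodes.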

Table 2: Summary of the topological properties of the users networks and news networks from both data sets. PASO and Control Group refer to the politically engaged users data set and the data set of users that mentioned media outlets, respectively.

Normalized Mutual Information between Partitions
To compare the communities obtained with the Louvain community detection algorithm against those found by other methods, the normalized mutual information was computed between the 20 main communities of the partitions detected by three different methods: Louvain, Infomap and Label Propagation.
Table 3: Normalized mutual information score between the partitions of the 20 main communities obtained by applying the Louvain, Infomap and Label Propagation algorithms to both the users and news networks.
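The normalized mutual information between two partitions can be sketched in a few lines. The version below normalizes the mutual information by the arithmetic mean of the two entropies; this is one common convention and an assumption here, since the normalization used for Table 3 is not stated in this section. The function name and toy labels are hypothetical.

```python
import math
from collections import Counter


def normalized_mutual_information(labels_a, labels_b):
    """NMI between two community assignments of the same nodes.

    Assumption: NMI = 2 * I(A; B) / (H(A) + H(B)).
    """
    n = len(labels_a)
    if n != len(labels_b):
        raise ValueError("both partitions must label the same nodes")

    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    def entropy(counts):
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    mutual_info = sum(
        (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    total_entropy = entropy(count_a) + entropy(count_b)
    if total_entropy == 0:
        return 1.0  # both partitions put every node in a single community
    return 2.0 * mutual_info / total_entropy


# Toy partitions (hypothetical): same grouping with different label names,
# and a grouping that is independent of the first one.
same_grouping = normalized_mutual_information([0, 0, 1, 1], ["x", "x", "y", "y"])
independent = normalized_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

The score does not depend on how the communities are named, only on how the nodes are grouped, which is what makes it suitable for comparing the output of different detection algorithms.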

Topic Description
In the main work, a topic decomposition of the news content in the set of politically active users was performed. In this section, word clouds for each topic of the two main news communities are shown for both years. It should be noted that, in both 2019 communities, a topic related to the National Election emerges. Among the most relevant words (in terms of tf-idf) are the names of the candidates of the two main political coalitions in the electoral contest. This co-occurrence motivated us to compute the sentiment bias, as mentioned in the main work.
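To make the tf-idf ranking concrete, the sketch below computes one common variant of the weights: relative term frequency multiplied by the logarithm of the inverse document frequency. The specific variant used in the main work is not stated here, so this choice is an assumption, and the function name and toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Assumption: weight = (count / document length) * log(N / document frequency).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document

    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy corpus (hypothetical tokens, not taken from the data set).
toy_docs = [
    ["election", "candidate", "vote"],
    ["election", "economy", "inflation"],
    ["election", "candidate", "debate"],
]
toy_weights = tf_idf(toy_docs)
```

A term that appears in every document receives zero weight, so the words that stand out in a topic are those that are frequent in it but not ubiquitous across the corpus.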

Analysis of the Control Data Set
As mentioned in the main work, a control data set was analyzed in order to assess the robustness of our original results. The analyses of both the users and news networks are presented in the following subsections.

The news projection
First, we present the results of the analysis of the news projection of the control group data set. Fig. S4 shows a visualization of the 2019 and 2020 networks. As before, two main communities emerge, each dominated by a given group of media outlets: on the one hand, Pagina 12 and El Destape; on the other hand, Clarin, Infobae and La Nación.

EPJ Data Science
Figure S2: Topic description in the term space, mapped onto word clouds, where the size of each word is proportional to the weight of the word in each topic. The center-right and center-left main communities of the 2019 news network are shown. Word clouds boxed in red highlight the National Election topics of both communities, where candidate names appear.
Motivated by the previous network visualizations, we computed the cosine similarity between the media outlet distributions of the main communities. This similarity was computed between communities of the same year and also between communities of different years. As shown in Fig. S5, two groups of communities can be identified: the center-right ones and the center-left ones.
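The comparison between media outlet distributions can be sketched as follows. Each community is represented by a vector with the share of its articles published by each outlet, in a fixed outlet order. The outlet order follows the outlets named in this section, but the numerical shares are invented for illustration only and are not results of this work.

```python
import math


def cosine_similarity(p, q):
    """Cosine of the angle between two vectors given in the same coordinate order."""
    dot = sum(x * y for x, y in zip(p, q))
    norm = math.sqrt(sum(x * x for x in p)) * math.sqrt(sum(y * y for y in q))
    return dot / norm if norm else 0.0


# Outlet order: Pagina 12, El Destape, Clarin, Infobae, La Nación.
# Shares below are hypothetical and only illustrate the computation.
community_a_2019 = [0.50, 0.40, 0.04, 0.04, 0.02]
community_a_2020 = [0.45, 0.45, 0.03, 0.05, 0.02]
community_b_2019 = [0.03, 0.02, 0.35, 0.35, 0.25]

same_group = cosine_similarity(community_a_2019, community_a_2020)
cross_group = cosine_similarity(community_a_2019, community_b_2019)
```

Communities dominated by the same outlets give a similarity close to 1 regardless of the year, whereas communities dominated by disjoint sets of outlets give a value close to 0.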
Then, for the two main communities of each year, we computed the media outlet distribution and performed a topic decomposition of the news content. Results are presented in Fig. S6. It is evident that the two communities are dominated by different sets of media outlets, as highlighted before.
Figure S3: Topic description in the term space, mapped onto word clouds, where the size of each word is proportional to the weight of the word in each topic. The center-right and center-left main communities of the 2020 news network are shown.
Finally, we compared the media outlet distributions of the five main communities of the control group news networks with those of the five main communities of the politically active news networks. The computed cosine similarities are shown in Fig. S7. We can appreciate the high degree of similarity between the two data sets within the center-left and center-right groups.

The users projection
Now, we present the results of the analysis of the users projection of the control group data set. Fig. S8 shows a visualization of the 2019 and 2020 networks, along with the corresponding word clouds that represent the average media-consumed vector of the main communities. Communities were colored according to their users' news consumption: those in which users tweeted mainly links to center-left media outlets were colored in red, and those with links to center-right outlets in blue. In Fig. S9, we show the similarities between the average media-consumed vector of the main communities and the media-consumed vectors of the users, for both years. As in the main analysis, a group of center-right and a group of center-left communities emerge.
Finally, the 2020 and 2019 users corrected media vector mapping is shown in Fig. S10. Users previously detected in the same community are shown in the same color: red for the center-right community and blue for the center-left community. The community structure is similar to that of the politically active users network. Also there is a clear difference in media
Figure S9: Similarities between users and average communities in media-consumed vectors. [A] and [B] correspond to the 2019 and 2020 data sets, respectively. The (i,j)-th element of each panel is the median of the distribution of cosine similarities between the average media-consumed vector of the i-th community and the media-consumed vectors of all users belonging to the j-th community.
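The matrix described in the caption of Figure S9 can be sketched as follows. The sketch assumes that each user is represented by a media-consumed vector in a fixed outlet order, that communities are labeled 0 to n - 1, and that every community has at least one user; the function names and the toy vectors are hypothetical.

```python
import math
from statistics import median


def cosine_similarity(p, q):
    """Cosine of the angle between two vectors given in the same coordinate order."""
    dot = sum(x * y for x, y in zip(p, q))
    norm = math.sqrt(sum(x * x for x in p)) * math.sqrt(sum(y * y for y in q))
    return dot / norm if norm else 0.0


def median_similarity_matrix(user_vectors, user_community, n_communities):
    """Entry (i, j): median cosine similarity between the average vector of
    community i and the vectors of all users in community j."""
    members = [
        [v for v, c in zip(user_vectors, user_community) if c == i]
        for i in range(n_communities)
    ]
    # Component-wise mean of the member vectors of each community.
    averages = [
        [sum(column) / len(group) for column in zip(*group)]
        for group in members
    ]
    return [
        [
            median(cosine_similarity(averages[i], v) for v in members[j])
            for j in range(n_communities)
        ]
        for i in range(n_communities)
    ]


# Toy data (hypothetical): two outlets, two communities of two users each.
toy_vectors = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]
toy_communities = [0, 0, 1, 1]
similarity = median_similarity_matrix(toy_vectors, toy_communities, 2)
```

Using the median rather than the mean keeps each entry robust to a few users whose consumption differs strongly from the rest of their community.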
Figure S10: 2020 and 2019 users corrected media vector mapping, after SVD transformation. Users belonging to communities previously identified as a block are shown in the same color.