Evolution of the political opinion landscape during electoral periods

We present a study of the evolution of the political landscape during the 2015 and 2019 presidential elections in Argentina, based on data obtained from the micro-blogging platform Twitter. We build a semantic network based on the hashtags used by all the users following at least one of the main candidates. With this network we can detect the topics that are discussed in society. Unlike most studies of opinion on social media, we do not choose the topics a priori; instead, they emerge naturally from the community structure of the semantic network. We assign to each user a dynamical topic vector which measures the evolution of her/his opinion in this space and allows us to monitor the similarities and differences among groups of supporters of different candidates. Our results show that the method is able to detect the dynamics of opinion formation on different topics and, in particular, that it can capture the reshaping of the political opinion landscape which led to the inversion of the result between the two rounds of the 2015 election.


I. INTRODUCTION
Our understanding of how opinion is formed and evolves in society has benefited over the last decade from the rapidly increasing amount of data made available by on-line social networks. As a result, large-scale, cross-cultural studies that were difficult to perform with standard off-line surveys have become feasible.
In spite of the different biases that are known to affect studies based on on-line social networks (in terms of age, gender, residence location, social status, etc.), the enormous amount of information they convey remains useful, in particular to detect trends in the evolution of social opinion, at least among the users of these platforms, whose number increases continuously. Moreover, traditional broadcasting media such as radio or television nowadays relay information or opinions selected from on-line social networks, thus coupling this large but biased set of users with the general population.
The micro-blogging platform Twitter has been widely used to study the evolution of social opinion on different topics [8,14], as well as the properties of the social interaction networks that result from the different functionalities offered by the platform (mentions, retweets, follower-followee) [5,17]. The intrinsic properties of the platform, like the small size of posts (tweets limited to 280 characters) or its simplicity of usage (a message can easily be re-transmitted, a mentioned user can be alerted, etc.), make it an excellent tool to study situations where the time scale of opinion evolution is short, as in electoral processes [1,10,14,24] or in social protests [2,7,18].
The possibility of predicting opinion evolution using Twitter, which is of particular interest during an electoral campaign, has been seriously challenged [12,15,22]. Nevertheless, studies of this kind remain interesting because of their explanatory power. For instance, it was possible to unveil the spontaneous character of the emergence of off-line demonstrations during the Spanish social movement 15M by correlating the intensity of posts at a given location with the different gatherings observed off-line [7]. It was also possible to identify the origin of a smear campaign against one of the potential candidates in the last French presidential election by studying the community structure of a network of retweets [14].
Twitter users share text messages, images and videos, and they may associate their posts with a concept by means of hashtags (words beginning with the character "#"), in order to bring that concept into public discussion.
Within the vast literature devoted to studies of social phenomena based on Twitter data, one can distinguish those studies that concentrate on the structure of the different networks that can be defined (follower-followee, retweets, mentions and answers) from those that try to infer the opinion dynamics based on text mining and analysis. Both aspects, structure and content, are nevertheless entangled, as users are mainly exposed to the content produced by those other users that they have decided to follow or by those selected by the algorithms of the platform [5,17]. This situation has been shown to lead to the phenomenon known as echo chambers or bubbles: in the worldwide open field of Twitter (as well as that of other internet platforms), users are mostly, and sometimes exclusively, exposed to the same information, frequently that which reinforces their own opinion, thus limiting the possibilities of a real discussion [13,19,23].
An interesting way to study the structure of opinions on Twitter exploits the hashtags chosen by the users, assuming that this choice reveals a concept that the user wishes to address. In a recent work [11], topics are defined by determining the community structure of a weighted network of hashtags, where two hashtags are connected if they appear together in the same tweet. Assuming that the co-occurrence of hashtags is semantically meaningful, the community structure of such a network can reveal the general topics under discussion. In this way, the users may be characterized by a topic vector, with a dimension equal to the number of communities detected, where each component informs about the interest of the user in the corresponding topic. The authors show that the similarity among users connected by a follower-followee relationship or by a mention relationship is higher on average than the similarity among a sample of random users.
In this work we extend the ideas developed in [11] to a dynamical study of the rapidly evolving opinion landscape of a society during an electoral campaign. This method allows us to recover the dynamics of the political tendency without putting questions to the population, which are known to be subject to different biases (of formulation, false declarations, etc.), and without imposing a priori either an ontology or the number of topics to be inspected. Our method just extracts the information encoded in the data, with the only assumption that two hashtags used in the same tweet are semantically related. Our results show that, in spite of the limitations of studies of opinion using Twitter described above, this method is able to capture the opinion evolution of the users with a time resolution high enough to detect, for example, the reconfiguration of the political landscape taking place in the short period between the first and second rounds of the election, which, in one of the cases presented here, overturned the score of the first round.

II. METHODS

A. Data Capture
This study is based on data captured during the two most recent Argentinian presidential campaigns. The capture is based on the active followers (we define a user as active on Twitter if he/she posted at least one tweet during the first month of capture) of the candidates for president or vice-president of each of the main political parties participating in these elections. We keep only those users whose profile location is set to some city/province in Argentina, in order to focus on those Twitter users that are residents in the country, and we capture and process all their tweets in the period. Table I gives a summary of the basic statistics for each dataset.
A detailed description is provided in the SI regarding the number of tweets captured daily and their classification as original tweets, simple retweets, retweets with comment and replies, along with an analysis of the geographical and gender distribution of our user base.

B. Definition of topics and user's description vectors
Hashtags are keywords created and chosen by the users, which can be interpreted as representing the engagement of users with events, ideas or different discussion subjects. If two hashtags co-occur frequently (i.e., they often appear together in the same tweet), it is reasonable to assume a semantic association between them. Following the ideas developed in [11], we build a complex weighted network based on hashtag co-occurrence. The topics of discussion then arise as the communities of this network, which we detect using the OSLOM algorithm [20]. It is worth noting that the topics simply emerge from the community detection algorithm, which is completely agnostic regarding their meaning and does not pre-determine their number.
We describe the interests of each user $i$ by means of a user description vector $\mathbf{d}_i$ of dimension $N_T$, the number of topics (communities) found, which informs about the topic preferences of user $i$.
This description vector is computed in the following way:
1. We build a user-topic matrix, $U$, where each element, $u_{ij}$, gives the absolute number of times that user $i$ has used a hashtag that belongs to the community identified as topic $j$.
2. We compute the global topic vector $\mathbf{T} = \sum_{i=1}^{N} \mathbf{u}_i$, where $\mathbf{u}_i$ is the $i$-th row vector in the user-topic matrix, and $N$ the size of the population. This vector gives the total number of times that each topic has been used by all the users in the dataset.
3. We define the vector $\mathbf{v}_i$ which gives the difference between the frequency of usage of each topic by user $i$ and its global frequency of usage in the population:
$$\mathbf{v}_i = \frac{\mathbf{u}_i}{\|\mathbf{u}_i\|_1} - \frac{\mathbf{T}}{\|\mathbf{T}\|_1} \qquad (1)$$
Here the norm $\|\cdot\|_1$ must be understood as the sum over all the components in the space of dimension $N_T$. The vectors of Eq. 1 thus inform about whether user $i$ has addressed each of the identified topics more or less than on average.
4. As we are only interested in the orientation of the description vectors, they are normalized as:
$$\mathbf{d}_i = \frac{\mathbf{v}_i}{\|\mathbf{v}_i\|_2}$$
where $\|\mathbf{v}_i\|_2$ is the standard Euclidean norm in the topic hyperspace of dimension $N_T$.
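The four steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function name `description_vectors` is hypothetical, and returning `None` for users whose vector is undefined (no hashtag usage, or usage exactly equal to the global average) is an assumption made here because the text does not say how such users are handled.

```python
import math


def description_vectors(U):
    """Compute normalized description vectors from a user-topic count matrix.

    U is a list of rows; U[i][j] is the number of times user i used a
    hashtag belonging to topic j. Returns one unit vector per user, or
    None when the vector is undefined (an assumption of this sketch).
    """
    n_topics = len(U[0])
    # Step 2: global topic vector T = sum over users of the rows of U.
    T = [sum(row[j] for row in U) for j in range(n_topics)]
    t_total = sum(T)  # L1 norm of T (all counts are non-negative)
    if t_total == 0:
        return [None for _ in U]
    global_freq = [t / t_total for t in T]

    vectors = []
    for row in U:
        total = sum(row)  # L1 norm of u_i
        if total == 0:
            vectors.append(None)
            continue
        # Step 3: user frequency minus global frequency, per topic.
        v = [row[j] / total - global_freq[j] for j in range(n_topics)]
        # Step 4: keep only the orientation (Euclidean normalization).
        norm = math.sqrt(sum(x * x for x in v))
        vectors.append([x / norm for x in v] if norm > 0 else None)
    return vectors
```

For example, with two users and two topics, `description_vectors([[3, 1], [1, 3]])` yields two opposite unit vectors, since each user over-uses the topic that the other under-uses.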
a. Dynamical measurements. In order to track the evolution of the users' interests we apply the aforementioned procedure to sliding time windows of 7 days, thus producing a series of matrices $U^t$, one for each day. We shall call $\mathbf{d}_i^t$ the description vector for user $i$ at discrete time $t$. The full procedure is illustrated in Fig. 1.
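A possible way to produce the daily user-topic counts is sketched below. The function name `windowed_counts`, the `(day, user, topic)` event format and the integer day index are hypothetical. The alignment of the window (here, the 7 days ending at day t) is also an assumption, since the text only states the window length.

```python
from collections import defaultdict


def windowed_counts(events, n_topics, window=7):
    """Build one user-topic count table per day over a sliding window.

    events: iterable of (day, user, topic) tuples, with integer day
    indices and topic indices in range(n_topics).
    Returns {t: {user: [count per topic]}}, where the table for day t
    counts the events with t - window < day <= t (assumed alignment).
    """
    events = list(events)
    if not events:
        return {}
    first = min(day for day, _, _ in events)
    last = max(day for day, _, _ in events)

    tables = {}
    # The first full window ends at day first + window - 1.
    for t in range(first + window - 1, last + 1):
        counts = defaultdict(lambda: [0] * n_topics)
        for day, user, topic in events:
            if t - window < day <= t:
                counts[user][topic] += 1
        tables[t] = dict(counts)
    return tables
```

Each table plays the role of one matrix in the series, to which the normalization steps can then be applied independently. The double loop is quadratic in the worst case; it is kept for readability rather than speed.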

C. Measuring the similarity between groups of users
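We define the similarity between a pair of users $i$ and $j$ as the cosine similarity between the corresponding description vectors. As the latter are normalized, the similarity reduces to the inner product:
$$s(i, j) = \mathbf{d}_i \cdot \mathbf{d}_j$$
We also define the average description vector of a group of users $G$, of cardinal $|G|$:
$$\bar{\mathbf{d}}_G = \frac{1}{|G|} \sum_{i \in G} \mathbf{d}_i$$
Now we can introduce two indices measuring collective similarities:
• The cohesion of a group of users, intra-group similarity or self-similarity, $s(G, G)$, defined as the average similarity between all its users, and computed in the following way:
$$s(G, G) = \frac{1}{|G|(|G|-1)} \sum_{\substack{i, j \in G \\ i \neq j}} \mathbf{d}_i \cdot \mathbf{d}_j = \frac{|G| \, \|\bar{\mathbf{d}}_G\|_2^2 - 1}{|G| - 1}$$
• The cross-group similarity is the average similarity between members of different groups $G_1$ and $G_2$, namely $s(G_1, G_2)$:
$$s(G_1, G_2) = \frac{1}{|G_1| |G_2|} \sum_{i \in G_1} \sum_{j \in G_2} \mathbf{d}_i \cdot \mathbf{d}_j = \bar{\mathbf{d}}_{G_1} \cdot \bar{\mathbf{d}}_{G_2}$$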
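Because the description vectors have unit length, both group indices can be obtained from group averages instead of explicit loops over pairs. The sketch below illustrates this under two assumptions that are not confirmed by the text: the cohesion averages over pairs of distinct users only, and the two groups compared in the cross-similarity are disjoint. All function names are hypothetical.

```python
def dot(a, b):
    """Inner product of two vectors given as sequences of floats."""
    return sum(x * y for x, y in zip(a, b))


def mean_vector(group):
    """Component-wise average of a non-empty list of vectors."""
    n = len(group)
    return [sum(column) / n for column in zip(*group)]


def cross_similarity(group1, group2):
    """Average of d_i . d_j over all i in group1 and j in group2.

    By linearity this equals the inner product of the two mean vectors.
    """
    return dot(mean_vector(group1), mean_vector(group2))


def cohesion(group):
    """Average of d_i . d_j over pairs of distinct users (assumption).

    Uses the identity sum_{i != j} d_i . d_j = n^2 |mean|^2 - n,
    which holds because every d_i is a unit vector.
    """
    n = len(group)
    if n < 2:
        raise ValueError("cohesion needs at least two users")
    m = mean_vector(group)
    return (n * dot(m, m) - 1) / (n - 1)
```

As a quick check, for the three unit vectors (1, 0), (0, 1) and (-1, 0) the pairwise similarities are 0, -1 and 0, so the cohesion is -1/3, which is the value that `cohesion` returns.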

III. RESULTS
The results presented in this section are based on tweets collected as described in Section II, for the last two presidential elections in Argentina (details of the retrieval and cleaning methods of the dataset can be found in the SI).
As explained in Section II B, we built the semantic network with the assumption that hashtags used in the same tweet carry some semantic similarity. The community structure of such a network reveals the topics that are discussed in society, and the description vectors allow us to characterize the interests of each user in the topic space.
In Fig. 2 we show the structure of a topic (right panel) composed of hashtags supporting the Cambiemos party (C), one of the two major parties in the second round of the 2015 election. As the topics emerge from the community analysis without any a priori information introduced into the system (they are arbitrarily labelled by a number), it is the inspection of the hashtags included in each community that informs about the subject to which each topic is related. On the left panel we show the cumulative number of supporters of each political party referring to that topic. This shows that, although it is not always true that people choose a hashtag only to support the idea it conveys (notice that members of other parties, including the strongest opponent, FV, also use the considered topic), on average, our method does correctly capture the expected preferences of the users. On the right panel, the k-core structure of the sub-network of hashtags that compose this topic is shown. The k-core decomposition represents a graph as a series of layers (the cores) in which the core of index k is the maximal induced subgraph with minimum degree k [4]. In our figures, the node color represents the coreness of each hashtag (the largest core to which it belongs), and its size represents its degree (the number of hashtags it is connected to). A similar description of a topic in support of the largest opponent party, FV, can be found in the SI.
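The coreness used to color the nodes can be computed with the standard peeling procedure: repeatedly remove a node of minimum remaining degree and record the largest minimum degree seen so far. The following sketch is a simple quadratic-time version for illustration only. The function name `coreness` and the adjacency-dictionary input are assumptions, and the link weights of the hashtag network are ignored here, since the definition in the text relies on the degree.

```python
def coreness(adj):
    """Return the coreness of every node of an undirected simple graph.

    adj: dict mapping each node to the set of its neighbours (every
    neighbour must also be a key; no self-loops).
    """
    degree = {v: len(neighbours) for v, neighbours in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel a node of minimum remaining degree.
        v = min(remaining, key=degree.get)
        # The coreness never decreases along the peeling order.
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core
```

For a triangle with one extra node attached to a single corner, the triangle nodes get coreness 2 and the pendant node gets coreness 1.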
Voting is compulsory under Argentinian law. As a consequence, not only are high participation rates observed, but political discussions also occupy a significant part of the public attention. In both elections political opinion was highly polarized, with two main antagonistic parties dominating the political spectrum. Two or three other smaller parties may, on certain occasions, play a pivotal role in determining the final result; understanding the evolution of the opinion of their supporters is therefore a crucial issue. We will see that this was the case.

A. Elections 2019
The two main parties intervening in this election were Juntos por el Cambio (JPC), whose candidate was the incumbent, and the challenger (and previous ruler) Frente de Todos (FDT).
The ruling party shows, in general, a higher self-similarity than the others, which increases as the PASO approaches, reflecting the strong cohesion of its supporters. This may be interpreted as a strong need to defend its governmental choices against its several opponents. Interestingly, peaks of strong self-similarity are observed at different times for the two minority parties at both extremes of the political spectrum, FI and FD. This is the signature of a particular event which triggers a coherent reaction among the supporters of one party, while for the supporters of the other parties the reaction to that event is not discriminating. This happens when the event resonates with the political traditions of one of these parties.
It is worth noting that the description vectors of the users contain a large diversity of topics, many of which do not have a political character. When the public discussion is dominated by one of these topics (for instance a football championship), the differences among the supporters of different parties may be partially and temporarily washed out. This effect is enhanced far from the electoral dates, where we can observe that all the parties fluctuate around the same value of self-similarity, except for the isolated peaks already mentioned.
In order to further investigate the observed isolated peaks, a careful inspection of the dominant topics at the considered date is necessary. This can be done using the platform we have created to analyze the evolution of the different topics [16]. Let us consider, as an example, the two sharp peaks observed in Fig. 3. Fig. 4 shows that the peak of the FD self-similarity curve, located at March 21st 2019 (in black), corresponds to a demonstration against tax rises which took place that day in front of the National Congress. This is a subject that usually interests right-wing and liberal parties, although JPC supporters (mainly liberals) did not comment on it in the same terms because their party, being in power, was responsible for the tax rise. In fact, an inspection of the topics that interested JPC supporters at that time using the platform [16] (cf. the smaller peak of the JPC curve (yellow) observed in Fig. 3) shows that this small peak does not correspond to the topic analysed here.
Cross-similarity curves reveal other aspects of the evolution of opinion during the electoral process. The brown curve in Fig. 6 shows the cross-similarity between the two main antagonistic parties in 2019, JPC and FDT, compared to the respective self-similarities (i.e., the same curves shown in Fig. 3), plotted as a reference. As expected, we observe that the higher the self-similarities, the lower the cross-party similarity. The cross-similarity strongly decreases in the vicinity of the primary and the main elections. Finally, Fig. 7 shows the cross-similarity between the FI party and the two main competitors.
Although the leftist party has a very small influence in the country, the interest of this curve is to reveal that its cross-similarity with FDT is clearly positive (and negative with JPC). This is clear evidence of the existence of a strong leftist component in FDT, as mentioned above, which is completely absent in JPC.

B. Elections 2015
The discussion of the results concerning this previous election is interesting not only as a test of the validity of our method in a different political context, but also because, unlike in 2019, this election required two rounds to establish the winner. Even more interestingly, the ranking of the top two parties was overturned in the second round. We will show that our method is able to identify how the details of the dynamics of the opinion of the supporters of the smaller parties contributed to this final result.
The dynamics of the self-similarities, as well as that of the cross-similarity between the two largest parties, follow a pattern similar to those of the 2019 elections and are detailed in the SI.
The most interesting feature of this electoral period is revealed by the cross-similarities between each one of the two smaller parties against the two leaders. Figure 8 shows

IV. SUMMARY AND CONCLUSIONS
We have performed a study of the dynamics of opinion during an electoral process, based on data obtained from the micro-blogging platform Twitter. While this subject has been explored in several works [1,9,10,14,21,24,26,27], here we apply a different, user-centered perspective on the discussions that are taking place on the platform. Most previous works on the subject define a set of keywords, hashtags or mentioned users (e.g., political candidates) to be tracked, and thus they obtain a dataset of tweets which are inherently political. Instead, by defining a set of seed users and capturing all the content that their followers generate, we have information about the evolution of the users' opinion on different topics, and we are not restricted to a subset of their tweets.
Following Ref. [11], the topics to study are not set a priori, but emerge from the community structure of a semantic network. This network is built with the assumption that two hashtags used in the same tweet carry some semantic relationship. The disclosed communities provide a representation of the opinion of each user in a multidimensional topic space. In this work we add the temporal dimension to the topic vectors, and therefore we are able to study the dynamics of the opinion of party supporters during the electoral campaign with great detail.
As discussed in the introduction, the known biases of the population using Twitter to foster political discussions, which lead, among others, to an over-representation of an urban male population [6,25], hamper the possibility of predicting electoral results. Instead, we show here how we can follow the evolution of political opinion through the different stages of an electoral period.
The case of the 2015 elections in Argentina shows that our method captures the details of the reshaping dynamics of opinion that was decisive in overturning the results between the two rounds of the election.
Although we cannot expect to predict the outcome of an election, one could still attempt to detect massive opinion changes in real time. In this respect, it is worth recalling some technical details in order to understand the possibilities and the limitations of the method developed here. In this work, the topics are determined by the community analysis of an aggregated semantic network, meaning that it has been built using the tweets collected during the whole electoral period. Tracking the semantic network in real time has the drawback of starting with a small network, with new hashtags entering as time evolves, which could hamper the correct initial determination of the topics for lack of data. A compromise would be to start from a semantic network aggregated during some initial period, in order to set the terms of the public discussion. From that starting point, one could then incorporate the new hashtags into the existing semantic network, following a sliding time window. In this way, the topics could be recalculated and the analysis of the similarities could be performed almost in real time, with just a small lag. It is expected that the more hashtags enter the semantic network, the more accurate the mapping of the opinion landscape will be. This assumption rests on the implicit hypothesis that the system tends to a steady (or meta-stable) state compatible with the aggregated network. However, this may not always be the case. It is easy to imagine that some event, like the present Covid19 pandemic, introduces at some point in time an important, short-time-scale modification of the semantic network. In such cases, integrating new hashtags as described above may capture, almost in real time, extreme patterns caused by the appearance of a rare event that triggers a modification of the structure of the semantic network.
This work is in progress.
We classified the tweets into 4 groups, using the following criterion:
• Replies: tweets having the is reply flag active according to the Twitter API.
• Simple retweets: tweets having the is retweet flag active according to the Twitter API.
• Retweets with comment (quote tweets): tweets having the is quote flag active according to the Twitter API, but with the is reply and is retweet flags down.
We observe that almost 55% of the tweets captured are simple retweets, while almost 20% are original tweets.
Each day, an average of 126k Argentinian users in our dataset showed some activity on the platform (see Figure 10, left). This represents 21% of our base of active Argentinian users (AAU). The average number of tweets captured per user was 477 or, in other words, an average of 1.4 tweets per user per day (Figure 10, right).

User affiliation
The affiliation of users to a political party was inferred from their follower relationships and their retweets of the candidates' accounts. We consider two scenarios:
• The user did not retweet any candidate: in this case, if he/she only follows candidates from a single party, then we infer that the user supports that party.
• The user retweeted at least one candidate: in this case, we count his/her retweets from each party, and we normalize them so that they add up to 1. If some party gets a proportion of at least 0.75, then we consider that the user supports that party.
Users that do not fulfill any of these conditions are not considered in any of our similarity analyses. The political affiliation of the users was determined during the pre-election period, before the primaries.
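The two scenarios above amount to a short decision rule, sketched here. The function name `infer_affiliation` and its input format (a set of followed parties and a dictionary of retweet counts per party) are hypothetical; only the 0.75 threshold and the two conditions come from the text.

```python
def infer_affiliation(followed_parties, retweets_by_party, threshold=0.75):
    """Return the party a user is assumed to support, or None.

    followed_parties: set of parties with at least one candidate
    followed by the user.
    retweets_by_party: dict mapping a party to the number of retweets
    the user made of that party's candidates.
    """
    total = sum(retweets_by_party.values())
    if total == 0:
        # Scenario 1: no retweets, so rely on follower relationships.
        if len(followed_parties) == 1:
            return next(iter(followed_parties))
        return None
    # Scenario 2: normalized retweet shares, compared to the threshold.
    party, count = max(retweets_by_party.items(), key=lambda item: item[1])
    return party if count / total >= threshold else None
```

For instance, a user with 3 retweets of party "A" and 1 of party "B" reaches a share of exactly 0.75 and is assigned to "A", whereas a 2-to-1 split leaves the user unassigned.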

Geographical and gender representativity
Each user's location was computed by combining the location informed in the user profile with data from GeoNames (https://www.geonames.org/). In the 2019 capture, for 392k users we could determine a specific location inside the country (either city or province), and for those we plotted their geographical distribution. Figure 11 shows that the capture has a fair geographical representativity at the province level. A similar representativity was found for the 2015 capture.
Twitter does not provide gender information in the user profiles. Thus, we analyzed the names in the profile by comparing them to a list of historical names from Argentina (https://data.buenosaires.gob.ar/dataset/nombres). In the 2019 capture, 32% of the profile names did not match any registered name (e.g., some users do not write their names but use aliases and/or emojis). For the remaining users (383k users), we found 43% females and 57% males. In the 2015 capture we found 55% males and 45% females, after filtering out 32% of users with unknown gender.

Hashtag usage
Hashtags are used on Twitter as a tagging convention to associate tweets with events, concepts or contexts. We found that 14% of the tweets contain at least one hashtag. Some of these hashtags become very popular while others go unnoticed; the probability distribution of hashtag usage follows a heavy tail (Figure 12, left), with 66% of the hashtags used by only one person, and 2,600 hashtags adopted by at least 1,000 users.
Detecting political hashtags. Some hashtags have a stronger political meaning than others. We designed a method to automatically identify political hashtags based on their usage by supporters of the different parties. Specifically, we assessed the correlation between hashtag usage and political affiliation in terms of the Kullback-Leibler divergence (relative entropy): given the empirical usage of each hashtag $h$ by users supporting each political party, we define a generalized Bernoulli distribution $P_h$ for each hashtag (represented as a vector containing the proportion of usages of hashtag $h$ in each party with respect to all users), and a generalized Bernoulli distribution $Q$ measuring the distribution of users across parties (i.e., the proportion of users supporting each party). Then, we compute the relative entropy $D_{KL}(P_h \| Q)$ between these distributions for each hashtag $h$:
$$D_{KL}(P_h \| Q) = \sum_{c=1}^{N_C} P_h(c) \log \frac{P_h(c)}{Q(c)}$$
where $N_C$ represents the number of parties. The idea behind this method is that apolitical hashtags $h$ will be used independently of party affiliation, and their distribution will be similar to that of the party sizes, implying that $D_{KL}(P_h \| Q) \approx 0$, while strongly political hashtags will have higher relative entropy values. The observed distribution of these relative entropies for the 2019 election is shown in Figure 12 (right), together with the threshold of 0.5 that we used to classify hashtags into political or apolitical. Hashtags above that threshold will be considered political.
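The relative entropy of a single hashtag can be sketched as follows. The function names and the dictionary-based inputs are hypothetical. Two choices are assumptions not fixed by the text: the natural logarithm is used (the base changes the scale against which the 0.5 threshold is read), and usage is supplied as one count per party, whatever the counting unit.

```python
import math


def kl_divergence(p, q):
    """Relative entropy D_KL(p || q) with the natural logarithm.

    Terms with p_i = 0 contribute nothing; q_i must be positive
    wherever p_i is positive.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def hashtag_relative_entropy(usage_by_party, users_by_party):
    """Compare the party distribution of one hashtag with party sizes.

    usage_by_party: dict party -> usage count of the hashtag.
    users_by_party: dict party -> number of supporters (all positive).
    """
    parties = sorted(users_by_party)
    total_usage = sum(usage_by_party.get(c, 0) for c in parties)
    if total_usage == 0:
        raise ValueError("the hashtag was never used by any supporter")
    total_users = sum(users_by_party.values())
    p = [usage_by_party.get(c, 0) / total_usage for c in parties]
    q = [users_by_party[c] / total_users for c in parties]
    return kl_divergence(p, q)
```

With two parties of equal size, a hashtag used equally by both gives 0, while a hashtag used by only one of them gives log 2 (about 0.69 with the natural logarithm), which would fall above a threshold of 0.5.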
Weighted hashtag network. This network was built by computing the number of unique users that ever used each pair of hashtags together in a tweet. Counting the number of unique users instead of the number of tweets filters out the behavior of users that continually post similar content, and is a way of controlling for bot-generated content. The semantic association of two hashtags appearing in the same tweet is reinforced by the fact that many people use them together. We removed links whose weight is below a threshold of 5. Table IV shows a brief description of these graphs.
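A minimal sketch of this construction is given below. The function name `hashtag_network` and the `(user_id, hashtags)` input format are hypothetical, and lowercasing the hashtags is an assumption of this sketch rather than a step described in the text.

```python
from collections import defaultdict
from itertools import combinations


def hashtag_network(tweets, min_weight=5):
    """Build the weighted co-occurrence network of hashtags.

    tweets: iterable of (user_id, hashtags) pairs, where hashtags is
    the list of hashtags found in one tweet.
    Returns {(tag_a, tag_b): weight} with tag_a < tag_b, where the
    weight is the number of unique users who used both hashtags in
    the same tweet. Links with weight below min_weight are dropped.
    """
    users_per_pair = defaultdict(set)
    for user, hashtags in tweets:
        # Deduplicate within the tweet; case folding is an assumption.
        tags = sorted({tag.lower() for tag in hashtags})
        for pair in combinations(tags, 2):
            users_per_pair[pair].add(user)
    return {
        pair: len(users)
        for pair, users in users_per_pair.items()
        if len(users) >= min_weight
    }
```

Because each pair stores a set of users, an account that repeats the same pair of hashtags in many tweets still contributes a weight of 1 to that link.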

Robustness to fake accounts and bots
We made an effort to mitigate the impact of bots and fake accounts during the design of the method, by following these strategies:
• As mentioned before, in the semantic network built by hashtag co-occurrence, each user is counted only once. That is, when we count the number of simultaneous usages of a pair of hashtags, we do not count the number of tweets but the number of unique users.
• When we measure the general usage vectors $T_i$, each user gets the same weight, because his/her description vector has been normalized. In this way, users with a very high activity do not distort the average behavior.
Contrary to other capture methods which are based on keywords, and which are more prone to capturing bot content, our method is based on chosen users (i.e., those who follow some candidate). This reduces the effect that fake accounts could have in our Twitter capture.

The left-wing parties FI and FDT were the ones most sensitive to these topics, as expected. In the case of Bolivia, it is interesting to point out that, while the small and more leftist FI was the first to raise concerns at the end of October (as the first round of the election was taking place), the major left party FDT took the lead during November, with the election already decided. The plots in the right panel of each figure show the main hashtags involved in each case.
Finally, Figures 16, 17, 18 and 19 have the aim of validating the political affiliation of the users. These figures show topics whose main content is the support to a specific party, and we observe that the majority of the users active in each topic are the supporters of the corresponding party. However, Fig. 17 shows how M. Macri, candidate from Cambiemos (C), received support from final triumph. On the contrary, in Fig. 18 we observe scarce support for D. Scioli, from the other major party.

In Figure 20 we show figures analogous to the ones in the main text for the 2015 election: the self-similarity for each party during the period, and the cross-similarity between the two main parties. The latter reflects the same pattern as in 2019: the cross-similarity between the two main competing parties is consistently negative, reaching its minimum in the ballotage.

Figure 20: (a) Self-similarity for each party; (b) Cross-similarity between the two main parties.