Figure 6 | EPJ Data Science

Figure 6

From: Design and analysis of tweet-based election models for the 2021 Mexican legislative election

Quantitative geographical analysis using official census results from 2020 and the subset (≈0.5%) of all of our Twitter data that contains geodata. Panel a shows, for each of the 32 states of Mexico, the percentage of the population of Mexico (green), of internet users in Mexico (red) and of Twitter users in our data (blue). Mexico City (MX) is clearly over-represented in the Twitter sample and is an outlier. The Pearson correlation coefficient between the percentage of the population and the percentage of internet users is \(r=0.98\); it decreases to \(r=0.67\) when the population is compared with our Twitter data. This value increases to \(r=0.96\) when we combine the data from MX, Hidalgo (HG) and the State of Mexico (MC), which together encompass the conurbation around Mexico City known as Greater Mexico City. Once these three states are combined, the Twitter and population percentages are therefore strongly correlated. Panel b shows the residuals of the population of Mexico with respect to internet and Twitter users, plotted over a range in which Mexico City is not visible. For the bulk of the data, the percentage of internet users is representative of the population within \(\lesssim 1.6\%\), while the Twitter data are representative within \(\lesssim 3.0\%\).
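The comparison described in the caption can be sketched as follows. This is a minimal illustration and not the authors' code: the helper names (`pearson_r`, `combine_states`, `residuals`), the merged label `GMC`, and the sign convention for the residuals (population share minus sample share) are assumptions. The per-state shares are made-up placeholder values for a handful of states, not the figure's data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def combine_states(shares, codes, merged_code="GMC"):
    """Sum the shares of the states in `codes` into one merged entry."""
    merged = {k: v for k, v in shares.items() if k not in codes}
    merged[merged_code] = sum(shares[k] for k in codes)
    return merged


def residuals(population, sample):
    """Per-state residual, assumed here to be population share minus sample share."""
    return {k: population[k] - sample[k] for k in population}


def correlation(population, sample):
    """Pearson r between two share dictionaries, aligned on state code."""
    keys = sorted(population)
    return pearson_r([population[k] for k in keys], [sample[k] for k in keys])


# Placeholder shares in percent (hypothetical values, NOT the paper's data).
population = {"MX": 7.0, "MC": 13.0, "HG": 2.5, "JA": 6.5, "NL": 4.5}
twitter = {"MX": 30.0, "MC": 4.0, "HG": 0.5, "JA": 7.0, "NL": 6.0}

r_before = correlation(population, twitter)

# Merge Mexico City, the State of Mexico and Hidalgo into one region.
gmc = {"MX", "MC", "HG"}
r_after = correlation(combine_states(population, gmc),
                      combine_states(twitter, gmc))

res = residuals(population, twitter)
print(f"r before merging: {r_before:.2f}, after merging: {r_after:.2f}")
```

With real per-state shares, the same three steps (correlate, merge the Greater Mexico City states, recompute) would give the before and after values of \(r\), and `residuals` would give the quantities plotted in panel b under the assumed sign convention.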
