Figure 7 | EPJ Data Science

Figure 7

From: The growing amplification of social media: measuring temporal and social contagion dynamics for over 150 languages on Twitter for 2009–2020

Language identification confusion matrices. We show a subset of the full confusion matrix covering the top 15 languages on Twitter. (A) Confusion matrix for tweets authored in 2013. The matrix indicates substantial disagreement between the two classifiers during 2013, the first year of Twitter’s efforts to provide language labels. (B) For 2019, the two classifiers agree on the majority of tweets, as indicated by the dark diagonal in the matrix. Minor disagreement is evident for particular languages, including German, Italian, and Undefined, and there is major disagreement for Indonesian and Dutch. Cells with values below 0.01 are colored white to indicate very minor disagreement between the two classifiers.
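
As a rough illustration of how a matrix like this could be assembled, the sketch below compares two per-tweet label sequences, keeps only a chosen subset of languages, and blanks out cells below 0.01. It is not the authors' code. The function names, the toy labels, and in particular the choice to normalize each row by the first classifier's total count for that language are assumptions; the caption does not state how the cell values are normalized.

```python
from collections import Counter


def confusion_matrix(labels_a, labels_b, languages):
    """Confusion matrix between two classifiers, restricted to `languages`.

    Assumption: each row is divided by the total number of tweets that
    classifier A assigned to that language (including tweets classifier B
    placed outside the displayed subset).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both classifiers must label the same tweets")

    pair_counts = Counter(zip(labels_a, labels_b))  # (label A, label B) -> count
    row_totals = Counter(labels_a)                  # label A -> count

    matrix = []
    for a in languages:
        total = row_totals.get(a, 0)
        row = [pair_counts.get((a, b), 0) / total if total else 0.0
               for b in languages]
        matrix.append(row)
    return matrix


def mask_small(matrix, threshold=0.01):
    """Replace cells below `threshold` with None (drawn as white in a plot)."""
    return [[v if v >= threshold else None for v in row] for row in matrix]


# Toy labels (hypothetical): one entry per tweet from each classifier.
labels_a = ["en", "en", "en", "en", "id", "id", "nl"]
labels_b = ["en", "en", "en", "und", "id", "nl", "nl"]
languages = ["en", "id", "nl"]  # displayed subset of the full label set

matrix = confusion_matrix(labels_a, labels_b, languages)
masked = mask_small(matrix)
for lang, row in zip(languages, masked):
    print(lang, row)
```

In this toy input, one of the four tweets labeled `en` by the first classifier is labeled `und` by the second, so the `en` row sums to less than 1 within the displayed subset. That is the expected behavior when only part of the full matrix is shown.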
