Figure 5 | EPJ Data Science

Figure 5

From: Generalized word shift graphs: a method for visualizing and explaining pairwise comparisons between texts

Generalized word shift graph of the change in Shannon entropy between the 30 days before and the 30 days after Twitter's change from 140 to 280 characters. Words are relatively "surprising" (+) or "unsurprising" (−) depending on whether their surprisal is higher or lower than the entropy, that is, the average surprisal, of words used in 140 character tweets. Many of the contributions among the top forty words come from unsurprising words being used relatively more (−↑) or less (−↓). The surprisal of each word went down (▽) or up (△) depending on whether it was used more or less, respectively, in 280 character tweets. The inset in the bottom left shows how the word shift scores accumulate as a function of word rank; the horizontal line in the middle of the plot marks how much is explained by the top forty words shown in the graph (see Materials and methods for more details). As the top of the word shift graph and the cumulative inset show, most contributions come from a long tail of relatively surprising words, even though these words mostly do not appear among the top forty.
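
The quantities named in the caption can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes each word's contribution to the entropy difference splits into a frequency part (change in relative frequency, weighted by how far the word's average surprisal sits from the reference entropy of the first text) and a score part (change in the word's own surprisal). The function names and the toy texts are hypothetical, and treating the surprisal of an unseen word as 0 is a simplifying assumption.

```python
import math
from collections import Counter


def relative_freqs(tokens, vocab):
    """Relative frequency of every vocabulary word in a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: counts[w] / total for w in vocab}


def surprisal(p):
    # Assumption: an unseen word gets surprisal 0, so p * surprisal(p) = 0
    # and the word adds nothing to the entropy of the text it is missing from.
    return -math.log2(p) if p > 0 else 0.0


def entropy(probs):
    """Shannon entropy in bits: the average surprisal of a text."""
    return sum(p * surprisal(p) for p in probs.values())


def entropy_shift(tokens_1, tokens_2):
    """Per-word contributions to entropy(text 2) - entropy(text 1)."""
    vocab = set(tokens_1) | set(tokens_2)
    p1 = relative_freqs(tokens_1, vocab)
    p2 = relative_freqs(tokens_2, vocab)
    h_ref = entropy(p1)  # reference value: entropy of the first text
    shifts = {}
    for w in vocab:
        s1, s2 = surprisal(p1[w]), surprisal(p2[w])
        # Used more or less, weighted by whether the word is relatively
        # surprising (+) or unsurprising (-) compared with the reference.
        freq_part = (p2[w] - p1[w]) * (0.5 * (s1 + s2) - h_ref)
        # Change in the word's own surprisal between the two texts.
        score_part = 0.5 * (p1[w] + p2[w]) * (s2 - s1)
        shifts[w] = freq_part + score_part
    return shifts, entropy(p2) - entropy(p1)


# Hypothetical toy texts standing in for the two tweet corpora.
text_1 = "the cat sat on the mat".split()
text_2 = "the cat and the dog sat on the big mat".split()

shifts, delta_h = entropy_shift(text_1, text_2)
for word, score in sorted(shifts.items(), key=lambda kv: -abs(kv[1])):
    print(f"{word:>4} {score:+.4f}")
```

Because the relative frequencies of each text sum to one, the reference term cancels across the vocabulary and the per-word scores add up to the total change in entropy. That property is what allows a cumulative inset like the one in the figure to be read as a share of the overall difference.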
