
Figure 4

From: Socially disruptive periods and topics from information-theoretical analysis of judicial decisions

Kullback-Leibler (KL) divergence to measure surprise in the time evolution of topics. We use the KL divergence to measure the change in the topic distribution in year \(y\) with respect to the distribution in: (A-B) the previous year; (C-D) a specific year \(y'\) in the past (for the housing corpus only). Although the number of decisions per year and the number of topics in the model differ across corpora, we ensure the comparability of the results by using the same sampling factor (that is, the same ratio between the number of words used to estimate a topic distribution and the number of topics in the model; see Section 2). Error bars, often smaller than the symbols, correspond to the standard deviation of the mean over several sub-samples with a fixed sampling factor. We compute the KL divergence from the topic distributions at each hierarchical level within each topic model (see Fig. 2 and Section 2). Here, we show results for the following levels \(L\) and the corresponding numbers of topics \(K\). (A, C) Word topics for housing: \(L = 1\) and \(K = 564\) word topics; homicides: \(L = 1\) and \(K = 688\) word topics; condominium: \(L = 1\) and \(K = 525\) word topics. (B, D) Legislation topics for housing: \(L = 2\) and \(K = 30\) legislation topics; homicides: \(L = 1\) and \(K = 17\) legislation topics; condominium: \(L = 1\) and \(K = 27\) legislation topics. Results obtained at the other hierarchical levels are qualitatively very similar (Figs. S5-S7 in Additional file 1).
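To make the procedure concrete, the sketch below illustrates the kind of computation the caption describes: for each year, topic distributions are estimated from sub-samples whose size is tied to the number of topics by a fixed sampling factor, the KL divergence \(D_{\mathrm{KL}}(P_y \,\|\, P_{y-1}) = \sum_k P_y(k) \log \bigl(P_y(k)/P_{y-1}(k)\bigr)\) is computed per sub-sample, and the error bar is the standard deviation of the mean over sub-samples. This is a minimal illustration, not the authors' implementation: the names `tokens_by_year`, `topic_of`, the smoothing constant, the log base, and the number of sub-samples are all assumptions for the example.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) between two discrete topic distributions.

    eps is an illustrative smoothing constant to avoid division by zero;
    the log base (here base 2, i.e. bits) is also a convention choice.
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log2(p / q)))

def yearly_kl_surprise(tokens_by_year, topic_of, n_topics,
                       sampling_factor=10.0, n_subsamples=20, seed=0):
    """Estimate D_KL(P_y || P_{y-1}) for each year with sub-sampled counts.

    tokens_by_year : dict year -> list of tokens in that year's decisions
                     (hypothetical input format)
    topic_of       : dict token -> topic index (hypothetical; in practice
                     topic assignments come from the fitted topic model)
    sampling_factor: fixed ratio of words per sub-sample to the number of
                     topics, so corpora of different sizes stay comparable
    Returns dict year -> (mean KL, standard deviation of the mean).
    """
    rng = np.random.default_rng(seed)
    n_words = int(sampling_factor * n_topics)  # words drawn per sub-sample

    def sampled_dist(tokens):
        # Draw a fixed-size word sample and count topic occurrences.
        sample = rng.choice(tokens, size=n_words, replace=True)
        counts = np.bincount([topic_of[t] for t in sample],
                             minlength=n_topics)
        return counts / counts.sum()

    years = sorted(tokens_by_year)
    result = {}
    for prev, curr in zip(years, years[1:]):
        kls = [kl_divergence(sampled_dist(tokens_by_year[curr]),
                             sampled_dist(tokens_by_year[prev]))
               for _ in range(n_subsamples)]
        # Error bar: standard deviation of the mean over sub-samples.
        result[curr] = (np.mean(kls), np.std(kls) / np.sqrt(n_subsamples))
    return result
```

Comparing against a fixed reference year \(y'\) (panels C-D) only changes the pairing: the second distribution is drawn from `tokens_by_year[y_ref]` for every year instead of from the preceding year.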
