Figure 1 | EPJ Data Science

Figure 1

From: Novelty and influence of creative works, and quantifying patterns of advances based on probabilistic references networks

Reference and Information Network formalism for Novelty of Creative Works. (A) A citation network for publications relies on self-reported bibliographies that connect a new work to older works (solid arrows). Factors such as human error and bias may cause missing links (dotted arrows), indicating a loss of information. (B) A general probabilistic network is constructed by connecting two works that share a common element, which represents a probabilistic reference. Unlike in a citation network, therefore, no known source is omitted due to error or bias. Furthermore, since an element can be truly invented or come from unknown sources, the Novel Pool (a statistical prior) is also considered. (C) Calculating the generation probability of a creative work given past works. The generation process is modeled as choosing elements either from the Conventional Pool (CP), which contains previously used elements (with duplication, so that frequently used elements are also common in the CP), or from the Novel Pool (NP), which represents ‘inventing’ the element or referencing an unknown source. Setting the NP to contain a fixed number of copies of each possible element corresponds to the maximum a posteriori (MAP) estimator with additive (Laplace) smoothing, Eq. (6). The probability of choosing an element is proportional to its multiplicity in the combined pool. Consider an example work of length four, \(\zeta=\{1,4,6,8\}\). Here element 1 is the most common (four copies in total: three from the CP and one from the NP), whereas elements 6 and 8 are the least common (one copy each, from the NP only). The multiplicities of these elements in the combined pool determine the generation probability of \(\zeta\). (D) Illustration by comparison of three example creative works \(\zeta_{1}\), \(\zeta_{2}\), \(\zeta_{3}\), each composed of the same number of elements (four). \(\zeta_{1}\) is composed of the most used element, 1, making it the most conventional (least novel), whereas \(\zeta_{3}\) is composed of unused elements only, giving it the highest novelty.
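
The pool construction in panels (C) and (D) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. Eq. (6) is not reproduced on this page, so the standard additive-smoothing form is assumed, together with independent draws, which makes the generation probability a product of element probabilities. The past works, the eight-element vocabulary, the three example works, and the use of negative log-probability as a novelty score are all hypothetical choices made for illustration.

```python
from collections import Counter
from math import log

def element_probabilities(past_works, vocabulary, alpha=1.0):
    """Probability of drawing each element from the combined CP + NP pool.

    Assumed form (standard additive smoothing):
        p(e) = (count of e in CP + alpha) / (size of CP + alpha * |vocabulary|)
    where alpha is the number of copies of each element placed in the NP.
    """
    # Conventional Pool: every element of every past work, with duplication.
    cp = Counter(e for work in past_works for e in work)
    pool_size = sum(cp.values()) + alpha * len(vocabulary)
    return {e: (cp[e] + alpha) / pool_size for e in vocabulary}

def novelty(work, probs):
    """Negative log generation probability (assumed novelty score).

    Assumes elements are drawn independently, so the log-probabilities add.
    """
    return -sum(log(probs[e]) for e in work)

# Hypothetical data chosen to match the counts quoted in the caption:
# element 1 appears three times in the CP; elements 6, 7, 8 never appear.
vocabulary = range(1, 9)
past_works = [[1, 2, 3], [1, 2, 4], [1, 3, 5]]
probs = element_probabilities(past_works, vocabulary, alpha=1.0)

# The combined pool holds 9 CP copies + 8 NP copies = 17 copies,
# so element 1 has 4 of 17 and elements 6 and 8 have 1 of 17 each.
zeta_1 = [1, 1, 1, 1]   # most used element only (hypothetical)
zeta_2 = [1, 4, 6, 8]   # the mixed example from panel (C)
zeta_3 = [6, 7, 8, 8]   # unused elements only (hypothetical)

for name, work in [("zeta_1", zeta_1), ("zeta_2", zeta_2), ("zeta_3", zeta_3)]:
    print(name, round(novelty(work, probs), 3))
```

Under these assumptions the novelty score increases from `zeta_1` to `zeta_3`, mirroring the ordering described in panel (D). Raising `alpha` gives the Novel Pool more weight and narrows the gap between conventional and unused elements.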
