Figure 3 | EPJ Data Science

Figure 3

From: Untangling performance from success

Predicting visibility and popularity. (A) Scatter plot of predicted Wikipedia page-views \(W_{M}(t)\) against collected page-views \(W(t)\), using the parameters \(A=3.747\) and \(C=7{,}929\) determined from the page-view history of the first 2 years (training data). Error bars indicate the 10th and 90th percentiles of the predictions in each bin; they are green if the line \(y=x\) lies between the two percentiles in that bin and red otherwise. The black circles are the average predicted page-views in each bin. (B) Comparison between Novak Djokovic’s observed Wikipedia page-views \(W(t)\) (blue) and his performance-based predicted page-views \(W_{M}(t)\). The model accurately captures the considerable lift in his career in 2011, when his page-views increased by about an order of magnitude. (C) The total predicted Wikipedia page-views for each player, capturing a player’s predicted popularity, compared to the actual total Wikipedia page-views over the whole period considered in our analysis. The symbol colors represent the best rank the player reached during the considered period. The highlighted points correspond to the most significant outliers, whose modified z-value for the logarithmic distance between the prediction and the data exceeds 3.5. The corresponding athletes are identified in the inset. (D) Separating the slow modes \(W_{S}(t)\), driven by career length \(Y(t)\) and rank \(r(t)\) (purple), from the fast modes \(W_{F}(t)\), driven by tournament value \(V(t)\), number of matches \(n(t)\) and the best-better-rival term \(e^{\Delta r(t) H(\Delta r) /r(t)}\) (orange).
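The outlier criterion in panel (C) can be illustrated with a short sketch. The caption does not give the formula for the modified z-value, so the code below assumes the common median/MAD definition with the constant 0.6745, and assumes base-10 logarithms for the logarithmic distance. The function names and sample numbers are hypothetical and are not taken from the paper.

```python
import math
from statistics import median


def modified_z_scores(values):
    """Modified z-scores based on the median and the median absolute deviation.

    Assumption: the standard form 0.6745 * (x - median) / MAD; the caption
    does not state which definition the authors used.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; modified z-scores are undefined")
    return [0.6745 * (v - med) / mad for v in values]


def flag_outliers(predicted, observed, threshold=3.5):
    """Flag players whose log-distance between prediction and data is extreme.

    Assumption: the logarithmic distance is log10(predicted) - log10(observed).
    """
    log_dist = [math.log10(p) - math.log10(o)
                for p, o in zip(predicted, observed)]
    scores = modified_z_scores(log_dist)
    return [abs(s) > threshold for s in scores]


# Hypothetical total page-views for seven players (illustrative only).
predicted = [100, 120, 90, 110, 105, 95, 100000]
observed = [100, 100, 100, 100, 100, 100, 100]
flags = flag_outliers(predicted, observed)
```

In this made-up sample only the last player, whose prediction is three orders of magnitude above the data, is flagged. Using the median and MAD rather than the mean and standard deviation keeps a single extreme point from inflating the scale against which it is judged.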
