Figure 5 | EPJ Data Science

Figure 5

From: Exploiting citation networks for large-scale author name disambiguation

Empirical and theoretical h-index distribution. (a) Testing the predictions of a stochastic h-index model with empirical data. Shown for each dataset is the empirical probability density function P(h), using logarithmic binning for h > 10. We fit each P(h) to the model distribution Pₘ(h), parametrized only by the distribution average, which is related to the mixing model parameters as 〈h〉 = λ₁λ₂. (Inset) Data collapse of the empirical distributions along the universal curve K₀(h; λ₁λ₂ = 1) (dashed grey curve) using the scaled variable x = h/〈h〉. (b) 6,498,286 clusters with h ≥ 2 were identified for the entire WoS disambiguation. Plotted are the probability distribution P(h) (green circles), the best-fit model Pₘ(h) with λ₁λ₂ = 3.49, and the complementary cumulative distribution P(≥h) (solid black curve). The numbers indicate the value associated with the percentile 100×(1−P(≥h)); e.g., 1 per 1,000 clusters (corresponding to the 99.9th percentile) has an h-index of 64 or greater.
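
As a rough illustration of the quantities named in the caption, the sketch below computes an empirical complementary cumulative distribution P(≥h), the percentile 100×(1−P(≥h)) read off from it, and the scaled variable x = h/〈h〉 used for the data collapse. It is a minimal sketch, not the authors' code: the function names (`ccdf`, `percentile_of`, `rescale`) are hypothetical and the sample values are made up for demonstration.

```python
from collections import Counter


def ccdf(h_values):
    """Return {h: P(>=h)} for every distinct h in the sample."""
    n = len(h_values)
    counts = Counter(h_values)
    result = {}
    remaining = n  # number of observations with value >= current h
    for h in sorted(counts):
        result[h] = remaining / n
        remaining -= counts[h]
    return result


def percentile_of(h, h_values):
    """Percentile 100 * (1 - P(>=h)) associated with the value h."""
    n = len(h_values)
    at_least = sum(1 for v in h_values if v >= h)
    return 100.0 * (1.0 - at_least / n)


def rescale(h_values):
    """Scaled variable x = h / <h>, with <h> the sample mean."""
    mean_h = sum(h_values) / len(h_values)
    return [v / mean_h for v in h_values]


# Made-up toy sample of h-index values (assumption, not data from the paper).
sample = [2, 2, 3, 3, 3, 4, 5, 8, 13, 64]

tail = ccdf(sample)
top_percentile = percentile_of(64, sample)
scaled = rescale(sample)
```

In this toy sample one value in ten is 64 or greater, so `tail[64]` is 0.1 and the associated percentile is the 90th; the caption's 99.9th percentile corresponds in the same way to P(≥64) = 0.001 for the full dataset.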
