Appendix 1: Explicit calculation of Kolmogorov-Smirnov p-value
The Kolmogorov-Smirnov p-value is a way to quantify the goodness of fit of some theoretical distribution to the empirical distribution of a dataset. For a given dataset \(\{ s_{1},s_{2},\ldots,s_{N} \}\), the corresponding empirical distribution is defined as
$$ M_{\mathrm{data}}(s) =\frac{1}{N} \sum_{j} \theta(s-s_{j}), $$
(7)
where θ is the Heaviside step function and the variable s represents scores. The goodness of fit is obtained by comparing this empirical distribution with a theoretical cumulative distribution (CCD). Thus, in order to use this criterion we need to express the rank distribution as a CCD. Ref. [28] shows that there is an equivalence between an empirical rank-value distribution and the empirical cumulative distribution of the scores (or frequencies) available from the data. The formula that relates these two functions is
$$ M_{i}(m_{i}) = \frac{N+1-k(m_{i})}{N+1}, $$
(8)
where \(m_{i}\) is the value of the theoretical rank distribution and k is the rank associated with score s. Thus, to obtain the CCD corresponding to a rank distribution \(m_{i}\), it suffices to apply Eq. (8). Note that \(k=k(m_{i})\), i.e. k is the inverse function of \(m_{i}(k)\). The p-value then measures how well \(M_{i}(m_{i})\) fits the empirical distribution of scores. Indirectly, owing to the equivalence stated in Eq. (8), this also measures the goodness of fit of \(m_{i}\) to the empirical rank-value distribution. In our case, the theoretical \(m_{i}\) is given by Eq. (2) and Eq. (3).
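As an illustration, the following minimal Python sketch evaluates Eq. (7) and Eq. (8); the sample scores and the power-law placeholder for \(m_{i}\) are examples only, not the models of Eq. (2) and Eq. (3).

```python
import numpy as np

def empirical_cdf(scores, s):
    """Empirical distribution M_data(s) of Eq. (7): the fraction of
    scores s_j that do not exceed s (theta is the Heaviside step)."""
    scores = np.asarray(scores, dtype=float)
    return np.mean(scores[None, :] <= np.atleast_1d(s)[:, None], axis=1)

def theoretical_cdf_from_ranks(m_fit, N):
    """Eq. (8): map the fitted rank-value curve m_fit(k), k = 1..N,
    to the cumulative levels M_i = (N + 1 - k) / (N + 1)."""
    k = np.arange(1, N + 1)
    values = m_fit(k)                # theoretical score at each rank
    M_i = (N + 1 - k) / (N + 1)      # cumulative probability at that score
    return values, M_i

# toy usage
print(empirical_cdf([3.2, 1.5, 4.8, 2.7, 3.9], [2.0, 3.5]))  # -> [0.2 0.6]
values, M_i = theoretical_cdf_from_ranks(lambda k: 100.0 * k ** -0.8, N=50)
```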
Next we define the Kolmogorov statistic D as the maximum distance between the empirical distribution of scores and the theoretical cumulative distribution,
$$ D = \sup_{s}\bigl\vert M_{i}(m_{i}) - M_{\mathrm{data}}(s) \bigr\vert . $$
(9)
We stress that, in the context of \(m_{i}\), the term 'value' refers to a score in the system.
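A corresponding sketch of Eq. (9), complementing the helpers above; it evaluates the supremum at the model values and is an illustrative choice, not the authors' implementation.

```python
import numpy as np

def ks_statistic(scores, values, M_i):
    """Kolmogorov statistic D of Eq. (9): the largest distance between the
    theoretical cumulative levels (values, M_i) and the empirical CDF."""
    scores = np.sort(np.asarray(scores, dtype=float))
    N = scores.size
    # empirical CDF evaluated at, and just below, each theoretical value,
    # so that the jumps of both step functions are taken into account
    right = np.searchsorted(scores, values, side="right") / N
    left = np.searchsorted(scores, values, side="left") / N
    return np.max(np.maximum(np.abs(M_i - right), np.abs(M_i - left)))
```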
Finally, we describe the process used to calculate the p-value (a Python sketch of the full procedure follows the list):
1. Compute the parameters of the fit of \(m_{i}\) to the empirical rank-value distribution (scores).
2. Obtain the empirical distribution of scores and \(M_{i}(s)\) with Eq. (7) and Eq. (8).
3. Calculate the Kolmogorov statistic D between \(M_{i}\) and \(M_{\mathrm{data}}\).
4. Generate a number of artificial datasets of scores (e.g. 2,500), distributed according to the fitted \(M_{i}\). Fit each of them to an artificial \(M_{i,\mathrm{art}}\) in order to obtain a value \(D_{\mathrm{art}}\).
5. Count how many of the 2,500 \(D_{\mathrm{art}}\) values are larger than the D value of the real dataset and divide by 2,500. The result is the p-value.
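The sketch below strings steps 1-5 together, reusing the `theoretical_cdf_from_ranks` and `ks_statistic` helpers above. The fitting routine `fit_model` is a placeholder for whichever model of Eq. (2) and Eq. (3) is used, and the artificial datasets are drawn by sampling ranks uniformly, which reproduces the discrete levels of Eq. (8) up to the 1/(N+1) convention; only the bootstrap logic follows the recipe literally.

```python
import numpy as np

def ks_pvalue(scores, fit_model, n_boot=2500, rng=None):
    """Bootstrap Kolmogorov-Smirnov p-value following steps 1-5.

    fit_model(scores) must return a callable m_fit(k) defined for k = 1..N;
    it is a placeholder for the fit to Eq. (2) or Eq. (3).
    """
    rng = np.random.default_rng(rng)
    scores = np.asarray(scores, dtype=float)
    N = scores.size

    # steps 1-3: fit the real data and compute its statistic D
    m_fit = fit_model(scores)
    values, M_i = theoretical_cdf_from_ranks(m_fit, N)   # Eq. (8)
    D = ks_statistic(scores, values, M_i)                # Eq. (9)

    # step 4: artificial datasets distributed according to the fitted M_i
    larger = 0
    for _ in range(n_boot):
        ranks = rng.integers(1, N + 1, size=N)   # each rank equally likely
        fake_scores = m_fit(ranks)
        m_art = fit_model(fake_scores)
        v_art, M_art = theoretical_cdf_from_ranks(m_art, N)
        D_art = ks_statistic(fake_scores, v_art, M_art)
        larger += D_art > D

    # step 5: fraction of artificial statistics exceeding the real one
    return larger / n_boot
```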
Appendix 2: Diversity and cumulative distribution
In previous work we have shown that, under very general conditions in which dynamic competition exists between positive and negative mechanisms, such as birth and death processes, the rank distribution is given by the ratio of two power laws [32]. In this Appendix we analyse the difference between the data associated with different realisations of such competitive dynamics and the fits to real data in terms of stochastic models such as \(m_{2}(k)\), \(m_{3}(k)\), and \(m_{4}(k)\) given by Eq. (2). Specifically, we adopt the more general point of view that the data (obtained for Indo-European languages [12] and several sports and games) may be represented by a one-step Markovian stochastic process for the allocation of ranks.
The difference between the data associated with several realisations of the competitive dynamics and the fits to the real data may be analysed by treating k as a continuous variable. In this case, the time evolution of the probability density of ranks \(P(k,t)\) is described by a Fokker-Planck equation (FPE),
$$ \frac{\partial}{\partial t}P(k,t)=-\frac{\partial}{\partial k} \bigl[ A(k)P \bigr] + \frac{\partial^{2}}{\partial k^{2}} \bigl[ B(k)P \bigr] , $$
(10)
where \(A(k)\) and \(B(k)\) are rank-dependent drift and diffusion coefficients, respectively.
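As a rough numerical illustration of Eq. (10), the sketch below advances \(P(k,t)\) with an explicit Euler step; the drift and diffusion coefficients, the grid and the time step are arbitrary placeholders, and no particular boundary conditions are enforced.

```python
import numpy as np

def fpe_step(P, k, A, B, dt):
    """One explicit Euler step of Eq. (10):
    dP/dt = -d/dk [A(k) P] + d^2/dk^2 [B(k) P]."""
    dk = k[1] - k[0]
    drift = -np.gradient(A(k) * P, dk)                       # -d(AP)/dk
    diffusion = np.gradient(np.gradient(B(k) * P, dk), dk)   # d^2(BP)/dk^2
    return P + dt * (drift + diffusion)

# toy example: constant coefficients, Gaussian initial condition on a rank grid
k = np.linspace(1.0, 100.0, 500)
P = np.exp(-(k - 50.0) ** 2 / 10.0)
P /= P.sum() * (k[1] - k[0])          # normalise the density
for _ in range(1000):
    P = fpe_step(P, k,
                 A=lambda q: 0.05 * np.ones_like(q),
                 B=lambda q: 0.5 * np.ones_like(q),
                 dt=1e-3)
```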
Note that in Figures 1, 3 and 4 the abscissa is not the rank k but \(x=\log k\). In other words, the systems exhibit a simpler behaviour in terms of the variable x, which suggests a multiplicative behaviour and, in turn, a log-normal process. Such a process is the statistical realisation of the product of many independent positive random variables and, by the central limit theorem applied in the logarithmic domain, it obeys a log-normal distribution. As a consequence, when written in terms of x the distribution can be expressed in the general form
$$ P(x,t)=P^{\mathrm{st}}(x)+P_{1}(x,t), $$
(11)
with \(x=\log k\). The explicit form of the stationary distribution \(P^{\mathrm{st}}(x)\) is well known [33, 34], and the time-dependent solution \(P_{1}(x,t)\) may be determined as follows. We first note that Eq. (10) may be rewritten as
$$ \frac{\partial}{\partial t}P(x,t)=\frac{\partial}{\partial x} \bigl[ B(x)P_{x} \bigr] + \alpha P_{x}+\beta P, $$
(12)
where \(\alpha=-A+B_{x}\), \(\beta=-A_{x}+B_{xx}\), and each subscript \(\bullet_{x}\) denotes a partial derivative with respect to x. This equation can be further simplified by introducing the variable \(v(x,t)\equiv B(x)P(x,t)\). Moreover, in order to simplify the discussion and the resulting equations, we consider the particular case where the drift and diffusion coefficients \(A(x)\) and \(B(x)\) are proportional to the same function \(g(x)\), i.e., \(A(x)=\lambda_{A} g(x)\) and \(B(x)=\lambda_{B} g(x)\). If \(\tau\equiv B(x)t\), then Eq. (12) reduces to
$$ \frac{\partial}{\partial\tau}v(x,\tau)=-\Lambda\frac{\partial v}{\partial x}+\frac{\partial^{2}v}{\partial x^{2}}, $$
(13)
with \(\Lambda\equiv\lambda_{A}/\lambda_{B}\). Let us now incorporate the multiplicative character mentioned above by defining \(u(x,\tau)\) through the following change of variables,
$$ \log\frac{v(x,\tau)}{u(x,\tau)}=\frac{\Lambda}{2} x-\frac{\Lambda^{2}}{4}\tau. $$
(14)
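For completeness, a direct check that this change of variables removes the drift term from Eq. (13): writing \(v=u\,e^{\phi}\) with \(\phi=\frac{\Lambda}{2}x-\frac{\Lambda^{2}}{4}\tau\),
$$ v_{\tau}+\Lambda v_{x}-v_{xx}= \Bigl[ \Bigl(u_{\tau}-\tfrac{\Lambda^{2}}{4}u\Bigr) +\Lambda\Bigl(u_{x}+\tfrac{\Lambda}{2}u\Bigr) -\Bigl(u_{xx}+\Lambda u_{x}+\tfrac{\Lambda^{2}}{4}u\Bigr) \Bigr] e^{\phi} = ( u_{\tau}-u_{xx} ) e^{\phi}. $$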
As a result Eq. (13) reduces to the diffusion equation
$$ \frac{\partial}{\partial\tau}u(x,\tau)=\frac{\partial ^{2}u}{\partial x^{2}}, $$
(15)
whose formal solution is a Gaussian,
$$ u(x,\tau)=\frac{1}{\sqrt{4\pi\tau}} \int_{-\infty}^{+\infty }e^{- ( x-x^{\prime} ) ^{2}/4\tau}u\bigl(x^{\prime},0 \bigr)\,dx^{\prime}. $$
(16)
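Numerically, Eq. (16) is a convolution of the initial profile with a Gaussian kernel of variance 2τ; a minimal sketch follows, in which the grid, the initial condition and τ are arbitrary choices.

```python
import numpy as np

def diffuse(u0, x, tau):
    """Eq. (16): propagate u(x, 0) with the heat kernel
    exp(-(x - x')^2 / (4 tau)) / sqrt(4 pi tau)."""
    dx = x[1] - x[0]
    xx, xp = np.meshgrid(x, x, indexing="ij")
    kernel = np.exp(-(xx - xp) ** 2 / (4.0 * tau)) / np.sqrt(4.0 * np.pi * tau)
    return kernel @ u0 * dx            # discretised integral over x'

x = np.linspace(-10.0, 10.0, 400)
u0 = np.exp(-x ** 2)                   # arbitrary initial condition
u1 = diffuse(u0, x, tau=0.5)
```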
The distribution of the time required for a stochastic process, starting from some initial state \(x_{0}\), to reach a threshold for the first time is known as the first-passage-time distribution (FPTD). We will now exhibit the relation between the diversity and the diffusion equation (15). To this end, and to simplify the notation, in what follows we shall again use the symbol t to denote τ.
Consider the absorbing boundary condition \(u(x_{c},t)=0\), where the subscript c identifies the absorption point \(x_{c}\), and let \(u(x,t;x_{0},x_{c})\) denote the probability density satisfying this boundary condition for \(x< x_{c}\). The survival probability \(S(t,x_{c})\), i.e. the probability that the particle has remained at a position \(x< x_{c}\) for all times up to t, is given by
$$ S(t,x_{c})\equiv \int_{-\infty}^{x_{c}}u(x,t;x_{0},x_{c})\,dx, $$
(17)
which is also the cumulative distribution of x at time t. Let \(h(t)\,dt=S(t)-S(t+dt)\) be the probability that a particle reaches the absorption point between times t and \(t+dt\). Expanding to first order in dt, the first-passage-time distribution \(h(t)\) is then given by
$$ h(t)=-\frac{\partial S(t)}{\partial t}, $$
(18)
and the relation between the cumulative distribution \(S(t)\) and the FPTD (between two arbitrary times \(t_{1}\) and \(t_{2}\)) is [35, 36]
$$ S(t_{1})-S(t_{2})= \int_{t_{1}}^{t_{2}}h\bigl(t^{\prime} \bigr)\,dt^{\prime}. $$
(19)
Clearly, as shown in Figure 3, the diversity \(d(k)\) (which counts the events that have reached rank k within a fixed time window) may be identified with the right-hand side of Eq. (19). This equation shows, firstly, the relation between the diversity and the diffusion equation (15). Secondly, since there is a relation between the solutions of the diffusion equation and random walks, there is also one between \(d(k)\) and the random walk model given by Eq. (6). We have already studied a particular case of these models in [12].
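As a final illustration of Eqs. (17)-(19), the sketch below uses the standard method-of-images solution of the diffusion equation (15) with an absorbing point \(x_{c}\) (a generic example, not the parametrisation fitted to the data) and checks numerically that the integral of the FPTD reproduces the difference of survival probabilities.

```python
import numpy as np
from math import erf

def survival(t, x0, xc):
    """S(t, x_c) of Eq. (17): probability that a walker started at x0 < xc
    has not yet reached the absorbing point xc (method of images)."""
    return erf((xc - x0) / np.sqrt(4.0 * t))

def fptd(t, x0, xc):
    """First-passage-time distribution h(t) = -dS/dt of Eq. (18)."""
    a = xc - x0
    return a / np.sqrt(4.0 * np.pi * t ** 3) * np.exp(-a ** 2 / (4.0 * t))

# Eq. (19): S(t1) - S(t2) equals the integral of h between t1 and t2
t1, t2, x0, xc = 1.0, 5.0, 0.0, 2.0
ts = np.linspace(t1, t2, 2000)
h = fptd(ts, x0, xc)
lhs = survival(t1, x0, xc) - survival(t2, x0, xc)
rhs = np.sum(0.5 * (h[1:] + h[:-1]) * np.diff(ts))   # trapezoidal integral
print(lhs, rhs)   # the two numbers agree up to discretisation error
```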