### Overall statistics

A total of 7,303,019 signatures were collected for the 19,789 petitions on the UK site and 12,974,475 on the US site. Figure 2 plots the total number of signatures for each petition against the petition's rank by total number of signatures, and shows that the distribution of signatures is highly skewed. A small number of petitions were signed many times each, while a large number of petitions were signed only a few times each (indeed, half the petitions received only one signature).

On the UK site, only 5 percent of petitions obtained 500 signatures in total, which is similar to the percentage achieving 500 signatures on the previous No. 10 petition site [13]. Beyond this, 4 percent of the petitions received 1,000 signatures. Only 0.7 percent attained the 10,000 signatures required for an official response, and 0.1 percent attained the 100,000 signatures required for a parliamentary debate. On the US site, 89 percent of petitions have more than 500 signatures (note that the petitions in the dataset have at least 100 signatures), suggesting that those that have passed the first bar of 100 signatures have enough momentum to proceed. But only 15 percent reach more than 10,000 and only 0.7 percent reach more than 100,000, the official measure of success.
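The rank ordering and threshold shares reported above can be reproduced from a list of per-petition totals. The sketch below is illustrative only: the function names and the toy `totals` list are assumptions, not the paper's data or code.

```python
def rank_size(totals):
    """Return (rank, signatures) pairs, with rank 1 for the most-signed petition."""
    ordered = sorted(totals, reverse=True)
    return list(enumerate(ordered, start=1))


def share_reaching(totals, threshold):
    """Fraction of petitions whose total is at least `threshold`."""
    return sum(1 for n in totals if n >= threshold) / len(totals)


# Toy totals, invented for illustration (not taken from either dataset).
totals = [1, 1, 3, 12, 600, 15000]
ranked = rank_size(totals)
share_500 = share_reaching(totals, 500)
share_10k = share_reaching(totals, 10_000)
```

Plotting `ranked` on logarithmic axes gives a rank-size view of the kind shown in Figure 2.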

### Outreach and growth

Despite the much higher threshold for success compared to the previous No. 10 platform (10,000 vs 500 signatures for an official response, plus the additional measure of success of 100,000 signatures for a parliamentary debate), a similar pattern in growth emerges, suggesting that the first day was crucial to achieving any kind of success. Petitions receiving 100,000 signatures after three months obtained, on average, 700 signatures within the first 5 hours, 3,000 signatures within the first 10 hours, and 5,000 signatures within the first 24 hours (the averages for all the petitions with fewer than 100,000 signatures are 3, 7, and 12 signatures, respectively). Although there are variations even between successful petitions, the general trend, which will be discussed in the next section, holds for all of the successful petitions. The external measure of 100,000 signatures as success is also clear in Figure 2: petitions rarely grow further once they pass the 100,000-signature mark.

Figure 3 and Figure 4 show the number of signatures over time for two example petitions and for all the petitions in the dataset, respectively. Figure 4, which shows the average growth curves for cohorts of petitions with a similar final number of signatures, makes it clear that even petitions with a large number of signatures collected the bulk of their signatures very shortly after launching. After a few days, the rate at which new signatures arrived slowed significantly for all petitions.

At first glance these findings seem to contradict the normal assumptions of economists and sociologists, who have assumed the production function for collective action to follow an S-shaped curve, with the shape dependent upon the distribution of thresholds in the population [27–30]. Rather than a slow accumulation of supporters building up to critical mass, at which point support ‘tips over’ into success, petitions that have been successful in receiving a large number of signatures demonstrate very rapid early growth, which decelerates over time. We will discuss this point in more detail and make numeric comparisons further below.

We attempt to capture the characteristic of early rapid growth and decay that the data reveal, with a model of ‘collective attention’ decay, drawing on Wu and Huberman [31]. In their model, they calculate a ‘novelty’ parameter relating to the novelty of news items on Digg (http://digg.com), a news sharing platform, that decays over time. In a more general framework, the decay in attention could occur for different reasons, for example reaching the system size limit, or lack of viral spread. We note that this minimalistic model is only one of many possible models that could be fitted to the data; however, the intrinsic simplicity of the model allows for characterization of the system-level growth behavior of the platform and easy comparison across platforms.

In the model, *N* agents at time *t* bring *Nμ* new agents in the next step on average, *μ* being a multiplication factor. In our case, this would mean that every signature on petition *i* brings \(\mu_{i}\) new signatures in the next hour, leading to exponential growth with rate \(\mu_{i}\) in the number of signatures. This model fits the empirical data quite well for the short period directly after a petition’s launch (see Figure 3). Very soon, however, the rate decays and new signatures arrive at a much lower rate.

As in the model of Wu and Huberman [31], we introduce a decay factor to capture this decrease. Specifically, we let the multiplication factor decay by introducing a second factor \(r(t)\), which decays in a way that is intrinsic to the medium: each signature at time *t* brings, on average, \(\mu_{i} r(t)\) new signatures in the next hour. This ‘outreach parameter’ can change over time and dampen the fast initial growth, correcting for the early saturation observed in the empirical data. The growth equation then reads:

$$ N_{i}(t+1) = N_{i}(t) \bigl(1+\mu_{i} r(t) \bigr). $$

(1)
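As a minimal sketch of how Eq. (1) behaves, the recurrence can be iterated hour by hour. The decay curve `example_decay` and the value of `mu` below are hypothetical, chosen only to illustrate fast early growth followed by saturation; they are not fitted to the data.

```python
def grow(n0, mu, r, steps):
    """Iterate Eq. (1): N(t+1) = N(t) * (1 + mu * r(t)) for `steps` hours."""
    counts = [float(n0)]
    for t in range(steps):
        counts.append(counts[-1] * (1.0 + mu * r(t)))
    return counts


def example_decay(t):
    # Hypothetical outreach curve, used for illustration only.
    return 1.0 / (1.0 + t) ** 2


# One petition starting from a single signature, followed for 48 hours.
trajectory = grow(n0=1, mu=2.0, r=example_decay, steps=48)
# New signatures gained in each hour.
hourly_gain = [b - a for a, b in zip(trajectory, trajectory[1:])]
```

With a decaying `r`, the hourly multiplication factor \(1+\mu_{i} r(t)\) shrinks towards one, so the trajectory flattens after the initial burst.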

The number of signatures at time *t* can be written as:

$$ N_{i}(t) = N_{i}(0) \bigl(1+\mu_{i} r(0)\bigr) \bigl(1+\mu_{i} r(1)\bigr) \cdots\bigl(1+\mu_{i} r(t-1) \bigr). $$

(2)

In the limit of small time increments, Eq. (2) converts to:

$$ N_{i}(t) = N_{i}(0) \mathrm{e}^{\mu_{i} \sum_{t'=0}^{t'=t}r(t')}. $$

(3)
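The step from Eq. (2) to Eq. (3) can be made explicit. As a sketch of the reasoning (not taken from the original derivation), take the logarithm of Eq. (2) and use \(\log(1+x)\approx x\), which holds when each hourly increment \(\mu_{i} r(t')\) is small:

$$ \log N_{i}(t) = \log N_{i}(0) + \sum_{t'=0}^{t-1} \log\bigl(1+\mu_{i} r\bigl(t'\bigr)\bigr) \approx \log N_{i}(0) + \mu_{i} \sum_{t'=0}^{t-1} r\bigl(t'\bigr). $$

Exponentiating both sides recovers the form of Eq. (3). The upper limit here follows the product in Eq. (2); Eq. (3) writes the sum up to \(t\), which differs by the single term \(\mu_{i} r(t)\).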

We can assume that the number of signatures at the beginning is one (the initiator of the petition), and therefore averaging of the logarithm of both sides of Eq. (3) leads to:

$$ \mathrm{E}\bigl[\log\bigl(N_{i}(t)\bigr)\bigr] = \mathrm{E}[ \mu_{i}] \sum_{t'=0}^{t'=t}r \bigl(t'\bigr), $$

(4)

where \(\mathrm{E}[\cdot]\) indicates the average over the whole sample.

In this framework, each petition has its own fitness and therefore an individual growth rate \(\mu_{i}\), whereas *r* characterizes the outreach power of the platform as a whole. The outreach of the platform is assumed to be independent of petition fitness and popularity. Disentangling these two factors enables us to calculate the outreach factor of the system from the whole sample of petitions. We average the logarithm of the number of signatures in hourly bins, starting from the time each petition is launched, then take the hourly increment of this average at time *t* and normalize it by the average logarithm of the number of signatures up to time *t*, as follows:

$$ r(t) = \frac{\mathrm{E}[\log(N_{i}(t))] - \mathrm{E}[\log (N_{i}(t-1))]}{\mathrm{E}[\log(N_{i}(t))]}. $$

(5)

We have calculated the outreach factor as a function of time according to Eq. (5) and illustrated it in Figure 5. The outreach factor decays very fast and, after a time span of 10 hours in the UK data and 30 hours in the US data, reduces to 0.1%.
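A minimal sketch of the calculation in Eq. (5) is given below. It assumes that each petition is represented as a list of cumulative signature counts in hourly bins, starting at one signature at launch; the function names and the toy series are illustrative assumptions, not the authors' code or data.

```python
import math


def mean_log(series_list, t):
    """Sample average of log N_i(t) over all petitions."""
    return sum(math.log(s[t]) for s in series_list) / len(series_list)


def outreach(series_list):
    """Eq. (5): r(t) = (E[log N(t)] - E[log N(t-1)]) / E[log N(t)], for t >= 1."""
    horizon = min(len(s) for s in series_list)
    means = [mean_log(series_list, t) for t in range(horizon)]
    r = []
    for t in range(1, horizon):
        if means[t] > 0:
            r.append((means[t] - means[t - 1]) / means[t])
        else:
            # No petition has grown beyond one signature yet, so r(t) is undefined.
            r.append(float("nan"))
    return r


# Toy cumulative counts per hour for three petitions (illustrative only).
petitions = [
    [1, 10, 20, 22, 22],
    [1, 4, 6, 6, 6],
    [1, 2, 2, 2, 2],
]
r_hat = outreach(petitions)  # r_hat[0] corresponds to t = 1
```

Because the estimate uses only sample averages of \(\log N_{i}(t)\), it does not require the individual growth rates \(\mu_{i}\) to be known.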

The model holds, however, only when the growth rates of different petitions come from a localized distribution with finite average and variance. To check this condition, we calculate the ratio between the sample average and variance of \(\log(N(t))\) for different *t* and check that the following linear relation holds:

$$ \frac{\mathrm{E}[\log(N(t))]}{\operatorname{Var}[\log(N(t))]} = \frac {\mathrm{E}[\mu_{i}] \sum_{t'=0}^{t'=t}(r(t'))}{\operatorname{Var}[\mu_{i}] \sum_{t'=0}^{t'=t}(r(t'))} = \frac{\mu}{\sigma^{2}}, $$

(6)

where *μ* and *σ* are the sample mean and the standard deviation of the individual growth rates. If the multiplicative model and the framework are valid, the ratio between the sample mean and the variance of \(\log(N)\) should remain constant over time. Figure 6 plots these two values and demonstrates the ratio does indeed remain constant. The root mean square of residuals from a diagonal line is 0.03.
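The diagnostic itself is simple to compute. The sketch below evaluates the ratio of the sample mean to the sample variance of \(\log(N(t))\) at each hour, so that it can be inspected for drift over time. It assumes the same hypothetical data layout of hourly cumulative counts per petition, and the function name is an illustrative choice.

```python
import math
import statistics


def mean_variance_ratio(series_list):
    """Ratio E[log N(t)] / Var[log N(t)] for each hour t >= 1.

    Needs at least two petitions, since the sample variance is used.
    """
    horizon = min(len(s) for s in series_list)
    ratios = []
    for t in range(1, horizon):
        logs = [math.log(s[t]) for s in series_list]
        var = statistics.variance(logs)  # sample variance (n - 1 denominator)
        if var > 0:
            ratios.append(statistics.mean(logs) / var)
        else:
            # All petitions have the same count at this hour; ratio undefined.
            ratios.append(float("nan"))
    return ratios
```

Plotting the hourly mean against the hourly variance, as in Figure 6, is an equivalent way to inspect the same relation.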

Furthermore, Eq. (3) can be simplified by normalizing the number of signatures to the final number of signatures achieved up to time \(t_{\mathrm{max}}\), \(N_{i}^{\mathrm{norm}}(t)=N_{i}(t)/N_{i}(t_{\mathrm {max}})\), and considering that the number of signatures at the beginning of the process is one for all petitions \(N_{i}(0)=1\):

$$ N_{i}^{\mathrm{norm}}(t) = \mathrm{e}^{\mu_{i} (f(t)-f(t_{\mathrm{max}}))} $$

(7)

and

$$ \mathrm{E}\bigl[\log\bigl(N_{i}^{\mathrm{norm}}(t)\bigr)\bigr] = \mu \bigl(f(t)-f(t_{\mathrm{max}})\bigr), $$

(8)

where \(f(t)=\sum_{t'=0}^{t'=t}r(t')\) is a function that depends only on time, through the evolution of the outreach factor. Equation (8) suggests that if we plot the average of the normalized growth curves for all the petitions in logarithmic space, they will collapse into a single curve. This average is plotted in Figure 7 along with the model fit and two logistic curves of the following forms:

$$ N(t)=\frac{N_{\mathrm{max}}}{1+B\mathrm{e}^{-Ct}} $$

(9)

and

$$ N(t)=\frac{N_{\mathrm{max}}D^{\nu}}{(D+B\mathrm{e}^{-Ct})^{\nu}} . $$

(10)

Since we have normalized the number of signatures to its maximum, we set \(N_{\mathrm{max}} = 1\). The curve described in Eq. (9) is the simple logistic function, also known as the S-curve, which now has two free parameters (*B* and *C*). In addition, we check the fit of the curve described in Eq. (10), which has two more parameters (*D* and *ν*) that allow for accelerated growth similar to what we observed in the data. We use an iterative least squares optimization method as implemented in Matlab to find the best fit of the curves to the data.
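For reference, the two candidate curves and a least squares objective can be written down directly. This is a sketch in Python rather than the Matlab routine used for the reported fits; the function names are illustrative, and the optimizer that minimizes the objective is deliberately left out.

```python
import math


def logistic(t, b, c, n_max=1.0):
    """Eq. (9): simple logistic curve."""
    return n_max / (1.0 + b * math.exp(-c * t))


def generalized_logistic(t, b, c, d, nu, n_max=1.0):
    """Eq. (10): logistic curve with the extra parameters D and nu."""
    return n_max * d ** nu / (d + b * math.exp(-c * t)) ** nu


def sum_squared_residuals(curve, times, observed):
    """Least squares objective for a one-argument callable `curve(t)`."""
    return sum((curve(t) - y) ** 2 for t, y in zip(times, observed))
```

Setting \(D = 1\) and \(\nu = 1\) in Eq. (10) recovers Eq. (9), so the second curve is a strict generalization of the first.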

To fit the model to the data in Figure 7, we simulate the process described in Eq. (1) with \(r(t)\sim r_{0}t^{-2.5}\) (taken from Figure 5) and \(\mu_{i}\) taken from a uniform distribution with an average of *μ* (tuned for the best fit to the data through an iterative least squares method).
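A rough sketch of such a simulation follows. Two choices here are assumptions, since the text does not specify them: the uniform distribution is taken on \([0, 2\mu]\) so that its average is *μ*, and the power law is evaluated from \(t = 1\) onwards to avoid the singularity at \(t = 0\). The function name and parameter values are illustrative, not fitted.

```python
import math
import random


def simulate_mean_log_norm(mu_mean, r0, hours, n_petitions, seed=0):
    """Average of log(N_i(t) / N_i(t_max)) over simulated petitions, per Eq. (1)."""
    rng = random.Random(seed)
    # Assumption: r(t) = r0 * t**-2.5, with the first hour at t = 1.
    r = [r0 * (t + 1) ** -2.5 for t in range(hours)]
    totals = [0.0] * (hours + 1)
    for _ in range(n_petitions):
        # Assumption: uniform on [0, 2 * mu_mean], giving an average of mu_mean.
        mu = rng.uniform(0.0, 2.0 * mu_mean)
        log_n = [0.0]  # log N_i(0) = log 1
        for step in r:
            # Eq. (1) in log space: log N(t+1) = log N(t) + log(1 + mu * r(t)).
            log_n.append(log_n[-1] + math.log1p(mu * step))
        final = log_n[-1]
        for t in range(hours + 1):
            totals[t] += log_n[t] - final
    return [x / n_petitions for x in totals]


curve = simulate_mean_log_norm(mu_mean=1.0, r0=1.0, hours=24, n_petitions=200, seed=1)
```

The resulting `curve` is the simulated counterpart of the left-hand side of Eq. (8) and could be compared with the empirical average while tuning `mu_mean`.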

It is evident that the explosive early growth is well captured by the model, whereas both S-curves fail to fit the data. The deviation between the data and the S-curves is more evident in the semi-log and log-log plots of Figure 7. To further quantify the goodness of the fits, we calculated the normalized average residuals to be 0.013, 0.81, and 6.23 for the multiplicative model, the sigmoid of Eq. (9), and the sigmoid of Eq. (10), respectively. These results indicate that the model of Eq. (3) provides a better fit to the data. Considering that Eq. (10) has four free parameters (compared to three in our model), one can confidently reject the superiority of a logistic S-shaped model.