Remaining popular: power-law regularities in network dynamics

The structure of networks has been a focal research topic over the past few decades. These research efforts have enabled the discovery of numerous structural patterns and regularities, bringing forth advancements in many fields. In particular, the ubiquitous power-law patterns evident in degree distributions, graph eigenvalues and human mobility patterns have provided the opportunity to model many different complex systems. However, regularities in the dynamical patterns of networks remain a considerably less explored terrain. In this study we examine the dynamics of networks, focusing on stability characteristics of node popularity, and present our results using various empirical datasets. Specifically, we address several intriguing questions – for how long are popular nodes expected to remain so? How much time is expected to pass between two consecutive popularity periods? What characterizes nodes which manage to maintain their popularity for long periods of time? Surprisingly, we find that such temporal aspects are governed by a power-law regime, and that these power-law regularities are equally likely across all node ages.


Introduction
The study of complex systems and their structure has seen a growing interest in the past few decades. Discovering the existence of seemingly ubiquitous meta-structures such as the power-law patterns evident in degree distributions [1][2][3], graph eigenvalues [4] and human mobility patterns [5] has heralded the use of a network oriented approach for modeling, analyzing and predicting the macroscopic and mesoscopic behavior of "real-world" systems in a myriad of everyday fields and applications.
Equally important is the quest for patterns and regularities in network dynamics, since these could be used for analyzing and predicting the dynamics within a broad range of domains. To date, research on network dynamics has focused on three main categories. The first and most studied is spreading dynamics over a static network structure [6][7][8][9][10][11]. The second prevalent research direction involves analyzing the dynamics of individual-level user activity, for instance by establishing inter-event power-law distributions [12][13][14][15][16][17]. The third line of studies explores network dynamics-related characteristics from a system-level perspective. These include shrinking diameters and network densification patterns [18], spectral evolution [19], community formation dynamics [20][21][22] and system-level bursty dynamics [23][24].
This study pertains to the third category, aiming to examine regularities in network dynamics while focusing on network stability patterns, as manifested through the popularity of nodes. In principle, throughout the network's lifetime, non-popular nodes may become popular and popular nodes might lose their status. In [25] we suggest a theoretical model which might explain such popularity changes. At the heart of this study, we analyze temporal aspects of node popularity periods, such as the time span for which a "popular node" is expected to maintain its popularity status, the number of consecutive popularity periods per node, and the time it takes for a node to regain its popularity after losing it. We show clear statistical regularities with regard to all aforementioned processes, in the form of an adherence to a power-law model, across various and distinct empirical datasets. We further show that such power-law regularities are equally likely across all node ages.
In order to provide a complete view of this phenomenon, we also examine two generative models and assess their ability to account for our empirical findings. Specifically, we inspect the prevalent Barabasi-Albert (BA) model (employing two parameter constellations) and the Trendy Preferential Attachment (TPA) model, which accounts for the system's aging processes. We find that while many of the dynamic patterns are also captured by the BA model, it fails to accurately account for the characteristics of highly popular nodes, for all examined constellations. In particular, while empirical evidence suggests that long-term popularity applies to all node ages, the BA model is highly biased towards early-joiners (i.e. nodes which joined the network at its early stages). The TPA model brings forth some advancement, as it is better able to qualitatively reproduce age-related phenomena; however, the low statistical significance of its results implies the need for further research. Our findings may shed new light on node ranking dynamics, enhancing the understanding of node popularity shifts on the one hand and their fortification on the other, regardless of node ages.

Data
In this study we analyze empirical datasets from three distinct domains, as elaborated below:

Amazon ranking dataset
The Amazon Product Rankings dataset [26] contains product reviews and metadata from Amazon, including 142.8 million reviews spanning August 1997-July 2006. This dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs). We construct weekly bipartite temporal networks, containing product ratings from the Amazon online shopping website. In each such temporal network, nodes represent users and products. An edge between a user and a product is formed if the user rated the specific product within the given timespan. Previous studies that used this dataset for the modeling of network properties can be found, for example, in [27][28][29][30].

ERC20 Ethereum blockchain ledger
Launched in July 2015 [31], the Ethereum Blockchain is a public ledger that keeps records of all Ethereum related transactions. The ability of the Ethereum Blockchain to store not only ownership, similarly to Bitcoin, but also execution code, in the form of "Smart Contracts", has recently led to the creation of a large number of new types of "tokens", based on the Ethereum ERC20 protocol. These tokens are "minted" by a variety of players, for a variety of reasons, having all of their transactions carried out by their corresponding Smart Contracts, publicly accessible on the Ethereum Blockchain. As a result, the ERC20 ecosystem constitutes a fascinating example of a highly varied financial ecosystem whose entire activity is publicly available from its inception. This dataset was used in several network theory related studies [32][33][34], including financial assets adoption [35] and malware and bot detection [36].
In order to preserve anonymity in the Ethereum Blockchain, personal information is omitted from all transactions. A user, represented by their wallet, can participate in the economy through an address, which is obtained by applying the Keccak-256 hash function to their public key. The Ethereum Blockchain enables users to send transactions in order to either send Ether to other wallets, create new Smart Contracts or invoke any of their functions. Since Smart Contracts are scripts residing on the Blockchain as well, they are also assigned a unique address. A Smart Contract is called by sending a transaction to its address, which triggers its independent and automatic execution, in a prescribed manner on every node in the network, according to the data that was included in the triggering transaction.
We have retrieved all transactions spanning from February 2016 to January 2019, resulting in 179,488,619 transactions, performed by 27,888,847 unique wallets, trading 79,451 distinct tokens. We construct weekly bipartite temporal networks, containing crypto-token transactions on top of the Ethereum Blockchain. Nodes represent trading wallets and crypto-tokens. An edge between a wallet and a token is formed if the wallet bought or sold the given token in the examined timespan.

eToro financial trading dataset
The financial transaction data used in this work was received from an online social financial trading platform for foreign exchanges, equity indices and commodities, called eToro [37,38]. This trading platform allows traders to take both long and short positions, with a minimal bid of as low as a few dollars, thus providing access for retail traders to investment activities that until recently were only available for professional investors. A key feature of eToro's "social trading" platform is that each trader can easily see the complete trading history of other investors. Investors can then set their accounts to copy one or more trades made by any other investors, in which case the social trading platform will automatically execute the trade(s). Accordingly, there are three types of trades: (i) Single (or non-social) trade: Investor A places a normal trade by himself or herself; (ii) Copy trade: Investor A places exactly the same trade as investor B's single trade; (iii) Mirror trade: Investor A automatically executes Investor B's every single trade, i.e., Investor A follows exactly investor B's trading activities (and implicitly their investment decisions). Both (ii) and (iii) are hereafter referred to as social trading, and can be regarded as decision making that is based on information received through the common social medium.
The data that was analyzed for this work encompasses approximately 3 million registered accounts, containing over 40 million trades during a period of 3 years. We construct weekly temporal networks, based on the mirroring activity of users on top of this platform. The nodes represent traders (followers and followees). An edge $(v_1, v_2)$ between two traders is formed if trader $v_1$ mirrored the trading activity of trader $v_2$. Previous studies of this dataset can be found in [39][40][41][42][43][44][45][46].

Temporal networks
We define temporal networks as follows.

Definition 2.1
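The temporal graph for a given timestamp $t$, $G_t(V_t, E_t)$, is the directed graph constructed from all transactions performed during the time period $[t - \Delta, t)$. The set of vertices $V_t$ consists of all entities participating in the network activity during that period:
$$V_t := \{\, v \mid v \text{ took part in a transaction during } [t - \Delta, t) \,\}$$
and the set of edges $E_t \subseteq V_t \times V_t$ is defined as:
$$E_t := \{\, (u, v) \mid u \text{ transacted with } v \text{ during } [t - \Delta, t) \,\}$$
The temporal degree of a vertex $v \in V_t$ is defined as:
$$\deg_t(v) := |\{\, u \mid (u, v) \in E_t \,\}|$$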
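As a minimal sketch of Definition 2.1, the snippet below builds one temporal graph from timestamped records. The record layout `(timestamp, source, target)`, the function name `temporal_graph`, and the reading of the temporal degree as an in-degree (the quantity ranked in the empirical analysis) are assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict

def temporal_graph(transactions, t, delta):
    """Build G_t from (timestamp, source, target) records in [t - delta, t)."""
    # Keep each distinct (source, target) pair once, as a set of directed edges.
    edges = {(u, v) for ts, u, v in transactions if t - delta <= ts < t}
    # Vertices are all endpoints that took part in the window's activity.
    vertices = {x for edge in edges for x in edge}
    # Assumed temporal degree: number of distinct sources pointing at a node.
    in_degree = defaultdict(int)
    for _, v in edges:
        in_degree[v] += 1
    return vertices, edges, dict(in_degree)

# Hypothetical toy records: (week index, wallet, token).
records = [(0, "w1", "tokA"), (0, "w2", "tokA"), (1, "w1", "tokB"),
           (1, "w1", "tokA"), (2, "w3", "tokB")]
V, E, deg = temporal_graph(records, t=2, delta=2)
```

Repeated transactions between the same pair collapse into a single edge here; a weighted variant would count them instead.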

Definition 2.2
Given a time-stamp $t \in [T]$, the rank assigned to node $v \in V_t$, according to its degree $\deg_t(v)$, is denoted by $\mathrm{rank}_t(v)$. The ranking is performed in descending order, such that rank 1 corresponds to the highest-degree node in $V_t$. Specifically, $\mathrm{rank}_t(v) = r$ if there are $r - 1$ nodes with a higher degree than $v$:
$$\mathrm{rank}_t(v) := 1 + |\{\, u \in V_t \mid \deg_t(u) > \deg_t(v) \,\}|$$
Ties are broken randomly, by a random internal ranking of groups containing identical-degree nodes. Given a threshold $\hat{T}$, a node $v$ will be referred to as popular if its rank is amongst the top-$\hat{T}$ nodes:
$$\mathrm{rank}_t(v) \le \hat{T}$$
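The ranking rule of Definition 2.2 can be sketched as follows. The function names and the use of a seeded `random.Random` for reproducible tie-breaking are illustrative assumptions; the shuffle-then-stable-sort trick is one possible way to realise "random internal ranking" within groups of equal degree.

```python
import random

def rank_nodes(degrees, rng=None):
    """Rank nodes by descending degree; ties are ordered at random."""
    if rng is None:
        rng = random.Random(0)  # assumed fixed seed, for reproducibility only
    nodes = list(degrees)
    rng.shuffle(nodes)  # random order decides the outcome of ties
    # Python's sort is stable, so equal-degree nodes keep the shuffled order.
    nodes.sort(key=lambda v: degrees[v], reverse=True)
    return {v: r for r, v in enumerate(nodes, start=1)}

def popular_nodes(degrees, threshold, rng=None):
    """Return the nodes whose rank is within the top `threshold`."""
    ranks = rank_nodes(degrees, rng)
    return {v for v, r in ranks.items() if r <= threshold}

# Hypothetical degrees for one temporal graph.
example_degrees = {"a": 5, "b": 3, "c": 3, "d": 1}
example_ranks = rank_nodes(example_degrees)
```

In this toy input, "b" and "c" share a degree, so they receive ranks 2 and 3 in an order that depends on the random generator.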

Fitting a heavy-tailed model
In this work, we find that the examined distributions present heavy-tailed patterns. In order to substantiate this hypothesis and determine the model best representing these distributions, we compared four plausible heavy-tailed models:
1. The Power-Law model: $p(x) \propto x^{-\alpha}$
2. The Truncated Power-Law model: $p(x) \propto x^{-\alpha} e^{-\lambda x}$
3. The Exponential model: $p(x) \propto e^{-\lambda x}$
4. The Log-Normal model: $p(x) \propto \frac{1}{x} \exp\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right)$
To this end we applied a prevalent statistical framework [1] encompassing two main stages. Namely, given a heavy-tailed model M:
Stage 1: fit the model and test its plausibility. Fit M to the data and assess its goodness of fit, rejecting M if the fit is not plausible.
Stage 2: compare plausible models. Compare all plausible models which were not rejected in the previous step using a likelihood ratio test. The log likelihood ratio test computes the ratio between the likelihoods of the given data under two competing distributions. The logarithm of this ratio is positive or negative depending on which model presents a better fit, or is zero if a tie is obtained. The sign of the log likelihood ratio is subject to statistical fluctuations, and when the ratio is close to zero these fluctuations can change its sign. In order to establish the statistical significance of the log likelihood ratio sign, we calculate its standard deviation and corresponding p-value, where small p-values indicate that the established sign is a reliable estimate of model compatibility.
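To illustrate the model-comparison stage, here is a simplified sketch of a log likelihood ratio test between two of the candidate models, a continuous power law and an exponential, both fitted by maximum likelihood above a fixed lower cutoff. The function name, the fixed `x_min`, the continuous (rather than discrete) densities and the synthetic sample are all assumptions made for illustration; this is not the fitting pipeline used in the study.

```python
import math
import random

def loglik_ratio(data, x_min=1.0):
    """Return (R, p): log likelihood ratio of power law vs exponential, and its p-value."""
    xs = [x for x in data if x >= x_min]
    n = len(xs)
    # Maximum-likelihood estimates for each model on the tail x >= x_min.
    alpha = 1.0 + n / sum(math.log(x / x_min) for x in xs)
    lam = 1.0 / (sum(xs) / n - x_min)
    # Pointwise log-likelihoods under each fitted model.
    ll_pl = [math.log((alpha - 1.0) / x_min) - alpha * math.log(x / x_min) for x in xs]
    ll_exp = [math.log(lam) - lam * (x - x_min) for x in xs]
    diffs = [a - b for a, b in zip(ll_pl, ll_exp)]
    R = sum(diffs)  # R > 0 favours the power law, R < 0 the exponential
    mean = R / n
    sigma = math.sqrt(sum((d - mean) ** 2 for d in diffs) / n)
    # Two-sided p-value for the sign of R under a normal approximation.
    p = math.erfc(abs(R) / (math.sqrt(2.0 * n) * sigma))
    return R, p

# Synthetic Pareto sample (density exponent 2.5) drawn by inverse-transform sampling.
rng = random.Random(1)
sample = [(1.0 - rng.random()) ** (-1.0 / 1.5) for _ in range(2000)]
R, p = loglik_ratio(sample, x_min=1.0)
```

The same comparison would be repeated for every pair of models that survived the first stage, including the truncated power-law and log-normal candidates, which this sketch omits.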

Barabasi-Albert (BA) model simulations
Introduced in 1999 [47], the BA Model was based on the discovery that a common property of many large networks is that vertex connectedness follows a scale-free power-law distribution. This feature appears generically in expanding networks where new vertices attach preferentially to already well connected sites. The proposed model managed to reproduce various stationary scale-free distributions, indicating that the development of large networks is governed by robust generic self-organizing phenomena that are agnostic to the particularities of the examined system. The BA Model has served as the basis of numerous studies in various scientific fields, including social network analysis [9,48,49], computer communication networks [50], biological systems [51], transportation [52,53], IoT [54], emergency detection [55], financial trading systems [40,41,44] and many others.
In order to substantiate our empirical findings, we examine the dynamics established by the Barabasi-Albert (BA) model. First we generate a Barabasi-Albert scale-free network $G(V, E)$ over n = 100,000 nodes. A single node is added at every iteration, each outputting ∼ Norm(μ, σ) for μ = 100, σ = 20. Specifically, each temporal network $G_t(V_t, E_t)$ is composed of all edges whose tags are within the following range:
$$E_t := \{\, e \in E \mid \mathrm{tag}(e) \in [t - \Delta, t) \,\}$$
and all the nodes that participated in these iterations:
$$V_t := \{\, v \in V \mid \exists\, u : (u, v) \in E_t \text{ or } (v, u) \in E_t \,\}$$
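A small-scale sketch of this procedure is given below: a BA network is grown, every edge is tagged with the iteration at which it was added, and a temporal network is cut out by tag range. The parameter values (`n=1000`, `m=3`, a fixed window of 100 iterations), the function names and the seed-graph choice are assumptions for illustration and do not reproduce the simulation settings of the study.

```python
import random

def ba_tagged_edges(n, m, seed=0):
    """Grow a BA network; return edges as (tag, new_node, target) triples."""
    rng = random.Random(seed)
    edges = []
    stubs = []                 # every node appears once per incident edge
    targets = list(range(m))   # assumed seed: the first newcomer links to m initial nodes
    for new in range(m, n):
        for v in targets:
            edges.append((new, new, v))   # tag = iteration at which the edge was added
        stubs.extend(targets)
        stubs.extend([new] * m)
        # Pick m distinct targets for the next newcomer, proportionally to degree.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(stubs))
        targets = list(chosen)
    return edges

def temporal_slice(edges, t, delta):
    """Edges tagged in [t - delta, t) and the nodes they touch."""
    E_t = [(u, v) for tag, u, v in edges if t - delta <= tag < t]
    V_t = {x for e in E_t for x in e}
    return V_t, E_t

ba_edges = ba_tagged_edges(n=1000, m=3)
V_t, E_t = temporal_slice(ba_edges, t=500, delta=100)
```

Because every iteration adds exactly `m` edges, each slice of this sketch contains a fixed number of edges, which bounds the temporal degrees from above.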

Trendy preferential attachment (TPA) model simulations
The Trendy Preferential Attachment model [56] is a forget-based extension to the BA model. It presents a network evolution model where edges become less influential as they age. The diminishing influence is modeled by a monotonically decreasing function $f(\tau)$ of their age $\tau$. We have chosen to apply $f(\tau) \propto 1/\tau^2$. As such, the probability of a new node to connect to another node $v$ at time $t$ is proportional to its time-weighted degree as follows:
$$P_t(v) \propto \sum_{\tau=1}^{t} f(\tau)\,\big(\deg_{t-\tau+1}(v) - \deg_{t-\tau}(v)\big)$$
where $\deg_t(v)$ is the actual (not time-weighted) degree of node $v$ at time $t$.
We start by generating a TPA network $G(V, E)$ over n = 100,000 nodes. A single node is added at every iteration, each outputting m = 20 edges, which are preferentially attached to existing nodes, proportionally to their time-weighted degree. Similarly to the process we have performed with the BA model, we tag each edge with the iteration at which it was added to the network. Next, in order to construct temporal networks from $G(V, E)$ we follow Def. 2.1 with Δ = 100.
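The attachment step can be sketched as below, assuming that a node's time-weighted degree is the sum of 1/age² over its incident edges, with age counted in iterations and at least 1. The function names, the seed graph and the small parameters are illustrative assumptions, and the quadratic-time weight computation is deliberately naive.

```python
import random

def time_weighted_degree(tags, t):
    """Sum of f(age) = 1 / age^2 over the tags of a node's incident edges."""
    return sum(1.0 / (t - tag) ** 2 for tag in tags)

def tpa_tagged_edges(n, m, seed=0):
    """Grow a TPA network; return edges as (tag, new_node, target) triples."""
    rng = random.Random(seed)
    incident = {v: [] for v in range(m)}   # node -> tags of its incident edges
    edges = []
    for t in range(m, n):
        nodes = list(incident)
        if t == m:
            targets = nodes                # assumed seed: first newcomer links to all
        else:
            weights = [time_weighted_degree(incident[v], t) for v in nodes]
            chosen = set()
            while len(chosen) < m:         # draw until m distinct targets are found
                chosen.add(rng.choices(nodes, weights=weights, k=1)[0])
            targets = list(chosen)
        incident[t] = []
        for v in targets:
            edges.append((t, t, v))        # tag = iteration of insertion
            incident[v].append(t)
            incident[t].append(t)
    return edges

tpa_edges = tpa_tagged_edges(n=300, m=3)
```

Temporal networks can then be cut from `tpa_edges` by tag range, in the same way as for the BA simulation.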

Empirical analysis
We examine the dynamics of node popularity levels over time, as measured by their degree-based rankings. In particular, for each temporal graph $G_t$ we rank all nodes in $V_t$ according to their in-degree, in descending order (see Def. 2.2 for further details), and examine these ranks over time. Table 1 presents a description of the ranked items in each dataset.
We start by presenting the popularity dynamics of several randomly chosen nodes from three real-world datasets, as qualitatively depicted in Fig. 1. We observe that nodes' popularity periods vary considerably in length, and that certain nodes can regain high levels of popularity even after massive drops in popularity.
In the rest of this section, we examine the dynamics of popularity from a system perspective. For a given temporal graph, a node is considered to be popular if it was among the top 100 highest-degree nodes. We start by examining the distribution of node popularity sequence lengths. As presented in Fig. 2, all three empirical datasets follow a truncated power-law distribution (see supportive statistical analysis in Tables 2, 3, Additional file 1). This result is rather surprising, as one might expect that once a node manages to join the "most popular list" it would maintain its popularity status for long periods of time. Real-world systems, however, do not abide by these rules. Instead, we note that the vast majority of popular nodes remain popular only for short periods of time and only a minority of nodes manage to preserve their popularity status for long periods of time.
Next, we analyze the distribution of the number of popularity sequences per node. As presented in Fig. 3, all three empirical datasets follow a truncated power-law distribution (see supportive statistical analysis in Tables 2, 3, Additional file 1). Interestingly, this implies that the vast majority of nodes, after losing their popularity status, will remain non-popular. However, there are a select few who manage to regain popularity over and over.
We also examine the distribution of gap lengths (in weeks) between consecutive popularity sequences of each node. As depicted in Fig. 4, all three empirical datasets (Amazon (panel A), Blockchain (panel B) and eToro (panel C)) follow a truncated power-law distribution (see supportive statistical analysis in Tables 2, 3, Additional file 1). This result suggests that most nodes which manage to regain their popularity do so after short periods of time. Nonetheless, and somewhat counter-intuitively, a few nodes manage to 'resurrect' and become popular again even after a rather long time.
Figure 5 Upper panels present a heat-map depicting popularity sequence lengths as a function of node inception time, with coloring standing for the number (log-scale) of nodes with a given popularity sequence length and inception time. All three empirical datasets (Amazon (panel A), Blockchain (panel B), eToro (panel C)) suggest that long popularity sequences apply to nodes of all ages. Lower panels depict the distributions of popularity sequence lengths, each with respect to different inception-related subgroups of nodes. All three empirical datasets (Amazon (panel D), Blockchain (panel E), eToro (panel F)) across all inception-related categories seem to follow a truncated power-law model.
Finally, we are interested in examining the characteristics of long-term popular nodes. Specifically, Fig. 5 depicts the connection between popularity sequence lengths and node "inception" times (the time at which a node was first introduced to the network). We first observe that long-term popularity has an approximately uniform spread across nodes of all ages, for the examined datasets (panels A-C). Furthermore, we find that the scale-free nature of the distribution of popularity sequence lengths (as exhibited in Fig. 2) is also "age-free" (panels D-F). Namely, even when removing X ∈ {0%, 10%, ..., 90%} of the oldest nodes from the network, the popularity distribution associated with the remaining subnetwork still follows a truncated power-law model (see Sect. 6.4, Additional file 1 for statistical support).
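The three quantities examined in this section (popularity sequence lengths, number of sequences per node, and gaps between consecutive sequences) can be extracted from a node's weekly popularity flags as sketched below, together with a helper for discarding a fraction of the oldest nodes. The function names and the toy inputs are hypothetical and serve only to make the definitions concrete.

```python
from itertools import groupby

def popularity_runs(is_popular):
    """Return (sequence lengths, gap lengths) from a node's weekly popularity flags."""
    runs = [(bool(flag), sum(1 for _ in grp)) for flag, grp in groupby(is_popular)]
    # Gaps are only counted between two popularity sequences, so leading and
    # trailing non-popular stretches are dropped.
    while runs and not runs[0][0]:
        runs.pop(0)
    while runs and not runs[-1][0]:
        runs.pop()
    lengths = [n for flag, n in runs if flag]
    gaps = [n for flag, n in runs if not flag]
    return lengths, gaps

def drop_oldest(inception, fraction):
    """Nodes remaining after removing the given fraction of earliest-inception nodes."""
    ordered = sorted(inception, key=inception.get)
    return set(ordered[int(len(ordered) * fraction):])

# Hypothetical weekly flags for one node (1 = within the top-ranked set that week).
timeline = [0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0]
lengths, gaps = popularity_runs(timeline)
num_sequences = len(lengths)
```

Pooling `lengths`, `num_sequences` and `gaps` over all nodes yields the three empirical distributions to which the heavy-tailed models are fitted; applying `drop_oldest` first restricts the pool to younger nodes.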

Generative models analysis
We next examine the popularity dynamics established by two network-evolution models. We start by exploring the well-known Barabasi-Albert (BA) model. This model, being one of the most prevalent and well-studied models for network evolution, was the first to account for the formation of power-law patterns in network structure (namely, heavy-tailed degree distributions). It is therefore interesting to verify the extent to which it manages to reproduce the power-law distributions observed in the network dynamics and their temporal characteristics. To this end, we construct temporal networks from a BA model simulation (see Methods section for details). Interestingly, we note that the BA model succeeds in capturing all three distributions (panels A-C, Fig. 6). However, it fails to reproduce the connection between nodes' popularity and their inception times, as seen in our empirical examples. In particular, the BA model is highly biased towards early-joiners becoming popular for long periods of time (panel D, Fig. 6) and its inception-time related distributions do not generally follow a power-law distribution (panel E, Fig. 6). Consider Sect. 6.4, Additional file 1, for supportive statistical analysis and Sect. 6.2 for a further analysis of a different BA temporal network configuration.
Figure 6 Dynamics analysis of the Barabasi-Albert model. Panel A depicts the popularity sequence lengths distribution, panel B manifests the number of popularity sequences distribution, and panel C presents the distribution of gap lengths between consecutive popularity sequences. All three distributions seem to follow a truncated power-law model, in accordance with our empirical evidence. Panel D presents a heat-map depicting popularity sequence lengths as a function of node inception times (node tag), indicating that long popularity applies solely to early-joiners. Panel E depicts the distributions of popularity sequence lengths, each with respect to different inception-related subgroups of nodes. In contrast to empirical evidence, popularity longevity is highly biased towards early-joining nodes.
We continue by analyzing the dynamics established by a forget-based extension to the BA model. We believe that a mechanism which allows recent activity to have a heavier impact on the edge attachment process might prevent the heavy popularity tilt towards early-joining nodes. In particular, we analyze the Trendy Preferential Attachment (TPA) model [56], which presents a network evolution model where edges become less influential as they age. We construct temporal networks from a TPA model simulation (see Methods section for further details). Interestingly, we note that such a forget-based mechanism manages to qualitatively reproduce both the power-law distributions and their age characteristics (Fig. 7). However, when applying goodness-of-fit (GOF) analysis to the results, we find that both the dynamical distributions and the age-related distributions obtain rather low statistical significance for their power-law fit (see Sect. 6.4, Additional file 1).
Figure 7 Dynamics analysis of the Trendy Preferential Attachment (TPA) model. Panel A depicts the popularity sequence lengths distribution, panel B manifests the number of popularity sequences distribution, and panel C presents the distribution of gap lengths between consecutive popularity sequences. All three distributions seem to follow a truncated power-law model, in accordance with our empirical evidence. Panel D presents a heat-map depicting popularity sequence lengths as a function of node inception times (node tag), indicating that long popularity applies solely to early-joiners. Panel E depicts the distributions of popularity sequence lengths, each with respect to different inception-related subgroups of nodes. This forget-based mechanism manages to reproduce both the power-law dynamics and their age-related characteristics.

Discussion
In this study, we examined various characteristics related to the dynamics of node popularity in networks. In particular, we have analyzed the lengths of time periods for which nodes attain high popularity, the number of such periods per node, and the distribution of time gaps between two such consecutive periods. We have shown that truncated power-law patterns accurately describe these characteristics of network dynamics within three distinct empirical datasets, providing what may be the first evidence for these particular power-law regularities in network dynamics. We further examined the characteristics of long-term popular nodes. We showed that across all three examined datasets, the tendency of nodes towards long popularity periods is not affected by the time at which they joined the network. Furthermore, we found that this scale-free property is also "age-free", as the power-law distribution is evident across all age categories.
While the Barabasi-Albert (BA) model manages to capture some of these dynamics-related characteristics, it fails to accurately account for the connection between popularity dynamics and node ages. In particular, it shows a considerable bias towards early-joiners having long popularity periods, in sharp contrast to the real-world networks we have examined. It is important to note, however, that the BA temporal networks in our simulations differ from real-world temporal networks since they are restricted to an exact number of edges, resulting in upper-bounded temporal degrees. Further research is required in order to fully comprehend the effect this restriction has on the established results. Nevertheless, a preliminary analysis we have performed (consider Sect. 6.2, Additional file 1 for further details) examined the effect of Gaussian noise added to the number of edges each temporal network consists of, and presented results consistent with the original BA specifications. We speculate that the origins of this mismatch between the BA model and the empirical evidence are rooted in the likelihood of popular nodes to be of a given age. Indeed, while the BA framework is heavily skewed towards early-joining popular nodes, the empirical datasets exhibit a roughly uniform distribution of inception times among popular nodes (see supportive analysis in Sect. 6.5, Additional file 1). This suggests that different forces are behind the empirically established power-law distributions.
Employing a forget-based extension of the BA model (TPA), we found that it is able to qualitatively reproduce the examined dynamical patterns, and is in better agreement with the age characteristics of popularity. Nevertheless, the low statistical significance of its results suggests the need for further research in order to understand the forces and mechanisms behind the observed dynamics and their age-related characteristics. Such efforts might include examining other recent network evolution models [56][57][58][59][60] and developing new generative models to account for these findings.
Furthermore, since the presented analysis was focused on economy-related datasets, it is intriguing to verify whether the established regularities are a specific characterization of economic networks, or whether they actually describe any social network, regardless of its domain. Accompanied by the increasing availability of temporal empirical data, these research directions could enable a much deeper understanding of dynamical regularities, and impact domains ranging from biology to social science.