Evaluating structural edge importance in temporal networks

To monitor risk in temporal financial networks, we need to understand how individual behaviours affect the global evolution of networks. Here we define a structural importance metric, which we denote as $l_e$, for the edges of a network. The metric is based on perturbing the adjacency matrix and observing the resultant change in its largest eigenvalues. We then propose a model of network evolution where this metric controls the probabilities of subsequent edge changes. We show using synthetic data how the parameters of the model are related to the capability of predicting whether an edge will change from its value of $l_e$. We then estimate the model parameters associated with five real financial and social networks, and we study their predictability. These methods have applications in financial regulation, where it is important to understand how individual changes to financial networks will impact their global behaviour. They also provide fundamental insights into spectral predictability in networks, and demonstrate how spectral perturbations can be a useful tool in understanding the interplay between micro and macro features of networks.


Introduction
Understanding how individual edges in a network influence its structure and evolution is important in a range of applications. Considering financial networks, network structure has implications for financial stability [1], market efficiency [2] and consumer safety [3]. Identification of players to monitor more closely is of paramount importance to regulators and policy makers, with many attributing the severity of the 2008 crisis to systemic flaws in the banking ecosystem [4]. Our research focuses on understanding how individual edges affect the structure of networks, and how this relates to network stability and evolution. We present a brief review of related literature, first considering individual effects on network structure, then works that link network structure to stability and systemic risk, before considering how network structure relates to temporal evolution. We then define a measure for structural edge importance, $l_e$, and we propose a model for network evolution in which an edge's importance can be indicative of future changes. Our results show that $l_e$ values are higher for edges which appear to play a more important structural role, and that subsequent changes occurring in the real networks analysed depend to some extent on the value of $l_e$.

Individual effects on network structure
The effect an individual node or edge can have on a network's structure depends not only on the scale of its activity, but also on its position within the network, and the activity of neighbouring nodes and edges. Understanding these interrelations remains one of the key challenges in network science.
Recently, structural node importance has gained a large amount of attention due to its relevance in use cases across a wide range of fields [5]. Methods have predominantly focused on network spectra, in order to elicit structural information from the network adjacency matrix. This includes numerous studies of epidemic processes, in which it is intuitive that the removal of a node that acts as a bridge between communities can be used to stem the spread of a disease, leading to significant effort being taken to understand the influences of community structure on epidemic spreading [5,6,7]. Similar applications include preventing network-based attacks [8,9] and understanding and acting on the spread of gossip in society [10]. This idea of network resilience is often approached from the angle of percolation theory, in which the percolation threshold governing the appearance of a giant component is often related to the leading eigenvalue of the adjacency matrix [11,12]. An alternative lens is taken by Wang et al. [13], who make use of the observation that the spectrum of the adjacency matrix gives an indication of community structure. Noting that for a network with $c$ strong communities the $c$ largest eigenvalues of the adjacency matrix are significantly larger than the others, they follow a perturbation-based approach to define node importance as the relative change in the $c$ largest eigenvalues upon the node's removal. Similar to Wang et al., Lü et al. [14] propose a universal structural consistency index for a network based on perturbing the adjacency matrix, and demonstrate that this index is a good indicator of link predictability. Our work considers the same central concept of applying perturbations to the adjacency matrix and focusing on the change in the leading eigenvalue, but differs in that we work at a finer level of granularity by proposing an edge-based measure, rather than a node-based or network-based one.
Many works in the financial literature focus on node-specific influence on stability. For example, Battiston et al. [15] define a node ranking coined DebtRank, which recursively takes into account the impact of distress of an initial node across the whole network. Their measure amounts to the fraction of the total economic value in the network that is potentially affected by the distress or default of a specific node. They applied their method to a network of loans from the Federal Reserve to financial institutions between 2008 and 2010, enriched with equity investment relations, and found a strongly connected core of 22 institutions which all became too systemically important to fail at the 2008 crisis peak. They demonstrated the effectiveness of their node ranking in comparison to other centrality measures, and found that it was the only measure to deliver a clear response well before the crisis peak. However, their method specifically considers the case of distress propagation, and does not explicitly measure how an individual node or edge affects the structure of the network in general. Barucca et al. [16] investigate whether a change to a few selected banks in the network of the e-MID market can affect the large-scale structure of a network, by applying node removal or degree mutation and comparing the resulting network structure to the original.
Although the bulk of the attention has focused on the importance of actors in networks, Helander et al. [17] propose a method for characterising the relative importance of an edge, which they refer to as edge gravity. Edge gravity measures how often an edge occurs in any possible network path. They show that important edges are not necessarily adjacent to nodes of importance as identified by standard centrality metrics, and they also observe that high centrality nodes often have their centrality over-represented by being adjacent to 'edges to nowhere'. Similar path-based methods include the BCC_MOD (Betweenness Centrality and Clique Model) proposed by [18], which weights the importance of the two nodes forming the endpoints of the edge with the number of cliques containing the edge. Their method outperforms several well-known methods, including the Jaccard coefficient and betweenness centrality, in identifying critical edges both in network connectivity and in spreading dynamics. In our work we define the importance of an edge in terms of the change that a small perturbation on the edge would induce in the leading eigenvalue of the weighted adjacency matrix of the network. While other definitions could be considered, we focus here on the leading eigenvalue because it determines, for instance, the stability of spreading processes on social networks [19,20], or financial shocks on inter-bank networks [21]. Our methods contrast with the above-mentioned path-based approaches by instead considering a network spectrum-based approach. However, both approaches show strong connections to node centrality measures; as shown in section 2, an approximation to the network eigenvalue derivative is proportional to the product of the constituent nodes' centralities. In addition, our research focuses on the temporal behaviour of the network in relation to structural importance, for which future work could consider using alternative measures of structural importance to understand the expected temporal behaviour.

Network structure in relation to stability and systemic risk
Increasing complexity and stability are inextricably linked, with works as early as May's investigations into ecosystems with increasing biodiversity highlighting the relationship [22]. In the context of financial markets, although market integration and diversification are widely believed to play a stabilising role [23,24], Bardoscia et al. [25] demonstrated that two factors of increasing complexity, namely increasing the number of institutions (nodes) and contracts (edges) in an interbank network, can drive the system to instability. Similarly, Markose et al. [26] present the idea of institutions being 'too interconnected to fail' through an exploration of the structure of the US CDS market. They consider an empirical network constructed from market shares, and make use of the May-Wigner condition for stability in comparison to a random network. They show that although the CDS structure shows better outcomes than a random network when subject to shocks, the demise of any one big player will bring down other big players. Caccioli et al. [27] showed in a theoretical exploration that uncontrolled proliferation of financial instruments can lead to large instability in markets, and suggest potential interventions such as the introduction of a Tobin tax [28], which is shown by Bianconi et al. to have a stabilising effect [29]. Related to this, Brock et al. [30] used 'arrow securities' as a proxy for more complicated hedging instruments, and found that these incentivise construction of larger positions, resulting in a reinforcement effect due to large gains/losses as a result of being on the 'right' or 'wrong' side of the market. They showed that this is associated with greater instability, and also that the primary bifurcation parameter, marking the onset of instability, occurs earlier when there are more arrow securities. In contrast to the majority of the data-centric financial literature, which focuses on interbank trading, Bardoscia et al. [21] analysed UK Trade Repository data, which includes all transactions occurring through a Central Counterparty clearing house (CCP) in the UK. Considering a snapshot of the open positions on a single day for interest rate derivatives, FX derivatives and credit default swaps as a three-layered network, they compared a ranking derived from the centrality measures to a ranking derived from modelling the network's response to liquidity contagion, looking at how shocks propagate across the network and translate into payment deficiencies across the different markets. The model considers the stress faced by an institution, that is, the difference between all payments it is required to make and all payment inflows from counterparties, and allows stress to spill over between the layers. They found that centrality measures can be used as a proxy for the vulnerability of financial institutions.

Network structure in relation to temporal evolution
To understand how networks evolve across time, many researchers have focused on studying the mechanisms for network growth, and defining network models to understand the origin of observed properties of real networks [31,32,33,34,35]. These include the Barabási-Albert model [36], which demonstrates that scale-free degree distributions observed in real networks can be explained by the presence of growth and preferential attachment in the network evolution. Falkenberg et al. [37] present a simple adaptation to the Barabási-Albert model, in which new nodes attach to nodes in the existing network in proportion to the number of nodes one or two steps from the target node. This results in an implicit time dependence, which arises from a node's attractiveness being dependent on its local environment, which changes as the network evolves. Central to their model is the idea that network structure and temporal evolution are inherently linked; however, their model is limited to the influence of the local environment. Others focus on considering temporal networks as multilayer networks, in which one can account for the fact that connectivity patterns in different layers can depend on each other. Bazzi et al. [38] proposed a generative model which explicitly incorporates a user-specified dependency between layers that is flexible enough to incorporate complex interlayer relationships, such as dependencies between a layer and all layers that follow, incorporating memory effects into the model. A handful of studies have attempted to link global network structure to temporal evolution, such as Peixoto et al. [39], who suggest a dynamical variation of the degree-corrected stochastic block model that is capable of finding meaningful large-scale temporal structures in real-world systems and predicting their temporal evolution. Their method works with both discrete and continuous time representations, making it versatile across a range of applications. Watts et al. [40] consider semi-random 'small world' networks and show that the dynamics are an explicit function of the network structure, and also find an enhanced propagation speed for small world networks.
A common and general framework for network growth is the fitness model, in which each node has associated with it a time-independent 'fitness' which represents its propensity to attract links, as proposed by Barabási and Bianconi [41]. They find that different fitnesses result in multiscaling in the dynamic evolution, or in other words that the time dependence of a node's connectivity depends on the fitness. Attempts have been made to build on this model in order to understand the origins of network dynamics, such as a recent study by Kobayashi et al. [42]. They find that population and activity dynamics are sufficient to explain two types of scaling empirically observed in real networks; however, by assuming a uniform distribution of fitness parameters, their methods do not explicitly allow for different roles to be captured within a network. In our research, we explore instead how an edge-level quantity derived from the spectrum of a network can similarly be used to determine which edges change in the network. We present methods for estimation of parameters which control both the overall activity in the network and the bias to change for edges with a larger structural importance, and we show how these reproduce behaviours observed empirically.
In the following sections, we look to address two questions: Can we quantify the extent to which an edge affects the overall network structure, and does this provide information on the network's temporal evolution? We know from the above that network structural information can be gained from the network spectra, both from the observation that the threshold for the appearance of a giant component in a network relates to the leading eigenvalue, and in that the number of communities can be determined from the number of well separated eigenvalues. We also see that the leading eigenvalue provides an indication of stability in terms of dynamical processes occurring on the network. Our aims are to understand node importance in terms of network structure and stability, so we look to capture both of these in our analysis by considering the derivatives of the network's leading eigenvalue with respect to individual nodes or edges. We present evidence that this measure could be a useful indicator in understanding temporal changes in network structure, and we present the results of its application to five real networks. Our main results demonstrate that the elementwise derivative of the leading eigenvalue ($l_e$) can be predictive of subsequent change for five different networks analysed, and that predictability can be related to the specific realisation of two parameters, $\alpha$ and $\rho$, in the network evolution model in which edges change with probability $\alpha l_e^{\rho}$. This has potential implications for stability, as a system experiencing more changes to edges of structural dominance could see a reinforcing effect, leading to an unstable system. These methods could be useful in classifying financial asset systems to inform regulation activities and policy making. We further show that the scale of resultant changes can be related to the realisation of two additional parameters, $\beta$ and $\gamma$, again with potential stability implications.

Definition of temporal networks
Traditionally, network analytics has focused on static representations of networks, either looking at single snapshots in time, or considering a projection of the time dimension onto a static view by aggregating the links in a time window. In doing so, some, or all, of the temporal information about the network is lost. Recently, however, there have been developments in the modelling of systems as temporal networks, in which the system is represented by a contact sequence $(i, j, t)$, where $i$ and $j$ are members of the vertex set $V$ that are in contact at time $t$. This representation also allows for edges that take time to traverse, or contracts completed after a duration $\delta t$, by representing the contact sequence as $(i, j, t, \delta t)$ [43]. Since we treat transactions as instantaneous, and are therefore not interested in transmission times for edges, and since we consider applications where time is discretised, we can formally define a temporal graph $G^w_t(t_{min}, t_{max})$ as in [44] as the ordered sequence of graphs

$$G^w_t(t_{min}, t_{max}) = \{G_{t_{min}}, G_{t_{min}+w}, \dots, G_{t_{max}}\}, \qquad (1)$$

where $w$ is the size of the time aggregation (e.g. daily). Element $A^s_{ij}$ of the adjacency matrix at time $s$ is 1 if and only if there exists a link between $i$ and $j$ in $G_t$, $t \le s \le t + w$.
This differs from a time sequence of static graphs in that the edges in the temporal network need not be transitive, i.e. $A \to B$ and $B \to C$ do not imply $A \to C$, and it also allows for time to be continuous, meaning the full topological structure and correlations are captured. Temporal networks can be extended to include weights associated with the edges.
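As an illustration of the discretised representation described above, the following minimal sketch aggregates a contact sequence $(i, j, t)$ into snapshots of width $w$. The function name `build_snapshots` and the example contacts are hypothetical, and the windows are taken to be half-open, $[t, t + w)$, which is an assumption made here so that a contact on a window boundary is counted only once.

```python
from collections import defaultdict

def build_snapshots(contacts, t_min, t_max, w):
    """Aggregate a contact sequence (i, j, t) into windows of width w.

    Returns a list of (window_start, edge_set) pairs, one per window.
    Assumption: windows are half-open, [t, t + w), so that a contact on a
    window boundary is counted once.
    """
    edges_by_window = defaultdict(set)
    for i, j, t in contacts:
        if t_min <= t <= t_max:
            index = int((t - t_min) // w)
            # Store undirected edges with a canonical node order.
            edges_by_window[index].add((min(i, j), max(i, j)))
    n_windows = int((t_max - t_min) // w) + 1
    return [(t_min + k * w, edges_by_window[k]) for k in range(n_windows)]


contacts = [("a", "b", 0), ("b", "c", 1), ("a", "b", 2), ("a", "c", 5)]
snapshots = build_snapshots(contacts, t_min=0, t_max=5, w=2)
for start, edges in snapshots:
    print(start, sorted(edges))
```

Each snapshot can then be converted to an adjacency matrix; weighted networks would accumulate a weight per edge instead of storing a set.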

Central concept: eigenvalue derivatives as a measure of importance
For a given graph $G_t(V, E)$ with adjacency matrix $A^t_{ij}$, the eigenspectrum of $A^t_{ij}$ is the set of eigenvalues $\lambda$ that satisfy the equation

$$A^t x = \lambda x \qquad (2)$$

for some non-zero vector $x$. By observing changes in the eigenspectrum of a graph, we can gain an insight into structural changes. As we are looking at network snapshots across time, we have a 'time series' of graphs and we can consider the change in the leading eigenvalue between successive time snapshots,

$$\Delta\lambda \approx \sum_{i,j} \frac{\partial \lambda}{\partial A_{ij}} \Delta A_{ij}, \qquad (3)$$

where we have made a first order approximation, and the derivative is with respect to the $(i, j)$th entry of the matrix, as opposed to the entire matrix. Here $\lambda$ refers to the leading eigenvalue of the adjacency matrix.
The two parts of equation 3 can be seen as an interplay between the potential of an edge to influence the structure ($\partial\lambda / \partial A_{ij}$) and the actual change in the network structure ($\Delta A_{ij}$). Our experiments with synthetic networks look to assess the extent to which our derivation, which makes approximations and assumptions, captures the true behaviour. The first term measures the sensitivity of the eigenvalue to changes in an individual edge, which we refer to as the structural importance of an edge and denote by $l_e$. We derive approximations for $l_e$ in equation 4 for the undirected case by taking a perturbation theory approach. Although not explicitly explored in this paper, we also present equation 5 for the directed case. In these equations, $\lambda_{0,i}$ refers to the $i$th component of the eigenvector corresponding to the leading eigenvalue, $s_A$ refers to the leading singular value of the adjacency matrix and $\lambda^M_{0,i}$ refers to the $i$th component of the eigenvector corresponding to the leading eigenvalue of $M = AA^T$. Full derivations for these can be found in appendix B, and we validate the approximation for the undirected case in results section 4.1. Both equations 4 and 5 are proportional to the product of the eigenvector centralities of the nodes involved in the edge. Our measures are defined in terms of the eigenvector corresponding to the largest eigenvalue, which usually has non-zero values only for the largest connected component of a network. For this reason, in this paper we restrict ourselves to exploring the giant component of the networks; generalising the measures to allow for disconnected components will be considered in future work.
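The proportionality to the product of eigenvector centralities can be checked numerically. The sketch below is not the paper's equation 4; it assumes the standard first-order perturbation result for a symmetric matrix, under which perturbing both symmetric entries of an undirected edge $(i, j)$ changes the leading eigenvalue at a rate $2 v_i v_j$, where $v$ is the unit-norm leading eigenvector. The normalisation used in the paper may differ, and the function names and toy graph are hypothetical.

```python
import math

def leading_eigenpair(A, iterations=500):
    """Leading eigenvalue and unit eigenvector of a symmetric non-negative matrix.

    Power iteration is run on A + I (an assumption made here so the iteration
    also converges for bipartite graphs); the eigenvector is unchanged.
    """
    n = len(A)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [v[i] + sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T A v gives the eigenvalue of A itself.
    lam = sum(v[i] * A[i][j] * v[j] for i in range(n) for j in range(n))
    return lam, v

def edge_importance(A):
    """First-order sensitivity 2 * v_i * v_j for every undirected edge (i, j)."""
    _, v = leading_eigenpair(A)
    n = len(A)
    return {(i, j): 2.0 * v[i] * v[j]
            for i in range(n) for j in range(i + 1, n) if A[i][j] != 0}

# Triangle 0-1-2 with a pendant node 3 attached to node 2.
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
importance = edge_importance(A)

# Finite-difference check on edge (2, 3): perturb both symmetric entries.
eps = 1e-4
lam0, _ = leading_eigenpair(A)
B = [row[:] for row in A]
B[2][3] += eps
B[3][2] += eps
lam1, _ = leading_eigenpair(B)
finite_difference = (lam1 - lam0) / eps
print(importance[(2, 3)], finite_difference)
```

In this toy graph the edge inside the triangle, whose endpoints both have high centrality, receives a larger value than the pendant edge.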
We can capture the relationship between $l_e$ and subsequent edge changes by observing the distributions of $P(\Delta A = 0 \mid \ln(l_e))$ and the joint probability $P(\Delta A, l_e)$, which we explore in detail in the results sections 4.2.5 and 4.3. Our findings from these are compared to our model for the temporal evolution of networks, which we propose in section 3, to assess the extent to which our model captures the true behaviour observed.
The second term considers the changes that subsequently occur in response to the value of $l_e$. This is of significance from a stability perspective; edges that are structurally important could cause a system to become unstable by changing frequently or by a large amount. Conversely, they may also act to stabilise a system if it begins to move towards a regime of instability. This can be explored by considering temporal graphs as described by equation 1, and assuming that the evolution is Markovian. We consider this first of all in the proposal of a model for network evolution, parameterised by the extent to which $l_e$ is indicative of the propensity of an edge to change, and the scale of the resultant changes. We further assess the predictability of changes from the value of $l_e$ through the use of a logistic regression classifier, and relate the performance of this to the model parameters.

Model for network evolution
In order to understand the relation between structural importance and stability of a network over time, we need a model that captures two behaviours. The first of these is that the value of $l_e$ is indicative of the probability for an edge to change, and the second is that the size of a resultant change can be related to $l_e$.
We thus propose a model in which we can control the extent to which $l_e$ influences a subsequent edge change, both in probability of occurrence and resultant scale. Specifically, we propose a model in which the network evolution exhibits the Markovian property as in [32,39] (equation 6), where $V_{ij} \sim B(\alpha (l_e)^{\rho})$ and $U^t_{ij}$ is the distribution of edge changes. Here we introduce two parameters which control the probability of an edge to change: $\rho$, which controls the level to which the value of $l_e$ influences the probability for an edge to change, and $\alpha$, which scales the parameter of $V_{ij}$ to ensure that it is a valid probability. A positive value for $\rho$ indicates that more important edges are more likely to change, and a negative $\rho$ would indicate the opposite.
The simplicity of this model means that we are unable to account for edges appearing and disappearing in the network.We will look to incorporate this in future research.
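The role of $\alpha$ and $\rho$ in the change probability can be made concrete with a short simulation. The sketch below only draws the change indicators $V_{ij}$, treating $B(\cdot)$ as a Bernoulli distribution; it does not model the size of the change $U^t_{ij}$. The function names and the example $l_e$ values are hypothetical.

```python
import random

def change_probability(l_e, alpha, rho):
    """Probability alpha * l_e**rho that an edge changes; must lie in [0, 1]."""
    theta = alpha * l_e ** rho
    if not 0.0 <= theta <= 1.0:
        raise ValueError(f"alpha * l_e**rho = {theta} is not a valid probability")
    return theta

def sample_changes(importance, alpha, rho, rng):
    """Draw one Bernoulli change indicator per edge."""
    return {edge: int(rng.random() < change_probability(l_e, alpha, rho))
            for edge, l_e in importance.items()}

# Hypothetical l_e values for three edges, for illustration only.
importance = {("a", "b"): 0.40, ("b", "c"): 0.10, ("c", "d"): 0.01}
rng = random.Random(0)
print(sample_changes(importance, alpha=0.9, rho=0.5, rng=rng))
```

With $\rho = 0$ every edge changes with the same probability $\alpha$, so $l_e$ carries no information about which edges change; with $\rho > 0$ and $l_e < 1$, edges with larger $l_e$ change more often.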

Parameter estimation in real networks
Assuming that our data evolves according to the model in equation 6, we can use observations from real networks to estimate the most likely values of $\alpha$ and $\rho$ from the data. Following a maximum likelihood approach, we can derive estimates for these parameters by maximising the following log-likelihood, as proposed in appendix E:

$$\ln \mathcal{L}(\alpha, \rho) = \sum_e \left[ k_e \ln \theta_e + (1 - k_e) \ln (1 - \theta_e) \right], \qquad (7)$$

where $\theta_e = \alpha l_e^{\rho}$, and $k_e$ is the observed outcome of edge $e$. We note here that since $\alpha$ and $\rho$ are constrained to result in a valid probability calculated from $\alpha l_e^{\rho}$, the optimisation is subject to constraints and must satisfy the Karush-Kuhn-Tucker conditions [45]. In practice, numerical optimisation of the log-likelihood in equation 7 was used to estimate $\alpha$ and $\rho$.
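A constrained fit of this kind can be sketched as follows. This is not the optimiser used in the paper: it assumes a Bernoulli log-likelihood for the change indicators and replaces a general numerical optimiser with a plain grid search, skipping any $(\alpha, \rho)$ pair for which some $\alpha l_e^{\rho}$ falls outside $(0, 1)$. The synthetic data, the grid and the "true" parameters are hypothetical.

```python
import math
import random

def log_likelihood(alpha, rho, importance, outcomes):
    """Bernoulli log-likelihood; returns None if any theta_e is outside (0, 1)."""
    total = 0.0
    for l_e, k_e in zip(importance, outcomes):
        theta = alpha * l_e ** rho
        if not 0.0 < theta < 1.0:
            return None  # infeasible: not a valid change probability
        total += k_e * math.log(theta) + (1 - k_e) * math.log(1.0 - theta)
    return total

def fit_grid(importance, outcomes, alphas, rhos):
    """Return the feasible (alpha, rho) grid point with the highest log-likelihood."""
    best = None
    for alpha in alphas:
        for rho in rhos:
            ll = log_likelihood(alpha, rho, importance, outcomes)
            if ll is not None and (best is None or ll > best[0]):
                best = (ll, alpha, rho)
    return best

# Synthetic data drawn from the model with hypothetical true parameters.
rng = random.Random(42)
true_alpha, true_rho = 0.5, 1.0
importance = [rng.uniform(0.01, 0.99) for _ in range(500)]
outcomes = [int(rng.random() < true_alpha * l_e ** true_rho) for l_e in importance]

alphas = [a / 20 for a in range(1, 21)]   # 0.05, 0.10, ..., 1.00
rhos = [r / 10 for r in range(-10, 31)]   # -1.0, -0.9, ..., 3.0
best_ll, alpha_hat, rho_hat = fit_grid(importance, outcomes, alphas, rhos)
print(alpha_hat, rho_hat, best_ll)
```

A grid search keeps the feasibility check explicit; a gradient-based constrained optimiser would be the natural replacement for larger datasets.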

Structural influence and network predictability
Depending on the values of the parameters for a given dataset, we might expect the observed values of $l_e$ to be predictive of subsequent change. Specifically, since $\rho$ controls the relationship between $l_e$ and the propensity for an edge to change, a high value of $\rho$ would suggest that $l_e$ would be more predictive of future change. Similarly for $\alpha$, within the constraints for $\alpha l_e^{\rho}$ to give the probability of an edge to change, a larger $\alpha$ factor will increase the distance between change probabilities for edges with different $l_e$, thus also strengthening the relationship between the value of $l_e$ and the propensity for an edge to change. In order to evaluate these effects, we make use of logistic regression for classification of edges into changing vs. unchanging from the values of $l_e$, and compare the results to a null model consisting of the average over multiple trials in which edges randomly change with probability equal to the fraction of observed changes. The data is split into training and test sets in a stratified manner, with 20% used to test the model on unseen data. The predictions are compared according to balanced accuracy, defined as the average of recall obtained on each class, and Area Under Curve scores for both Receiver Operating Characteristic curves and Precision-Recall curves.
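The classification step and the balanced accuracy score can be sketched in a few lines. This is a minimal illustration, not the pipeline used for the results: it fits a single-feature logistic regression by plain gradient descent, omits the stratified train/test split and the null model, assumes both classes are present, and uses hypothetical $\ln(l_e)$ values and labels. The feature is centred before fitting, a choice made here to help gradient descent converge.

```python
import math

def fit_logistic(x, y, learning_rate=0.1, steps=2000):
    """Fit p(change) = sigmoid(w * x + b) by gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        p = [1.0 / (1.0 + math.exp(-(w * xi + b))) for xi in x]
        grad_w = sum((pi - yi) * xi for pi, yi, xi in zip(p, y, x)) / n
        grad_b = sum(pi - yi for pi, yi in zip(p, y)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def predict(x, w, b):
    """Label an edge as changing (1) when the fitted probability is at least 0.5."""
    return [int(w * xi + b >= 0.0) for xi in x]

def balanced_accuracy(y_true, y_pred):
    """Average of the recall obtained on each class (both must be present)."""
    recalls = []
    for label in (0, 1):
        predictions = [p for t, p in zip(y_true, y_pred) if t == label]
        recalls.append(sum(p == label for p in predictions) / len(predictions))
    return sum(recalls) / len(recalls)

# Hypothetical ln(l_e) values and change labels, for illustration only.
log_l = [-6.0, -5.5, -5.0, -4.0, -3.5, -3.0]
changed = [0, 0, 0, 1, 1, 1]
mean = sum(log_l) / len(log_l)
features = [value - mean for value in log_l]  # centred feature

w, b = fit_logistic(features, changed)
score = balanced_accuracy(changed, predict(features, w, b))
print(w, b, score)
```

Balanced accuracy is used rather than plain accuracy because changing edges can be much rarer than unchanged ones, in which case always predicting "no change" would otherwise score well.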

Validation of $l_e$ using toy networks
Here we assess the extent to which the approximations made in calculating $l_e$ hold. We do this by approximating the change in eigenvalues as the coefficient-weighted sum of the edge weight changes, $\Delta\lambda = \sum_e l_e \Delta A_e$, and comparing the gradient of this to the value of $l_e$. Our derivation of $l_e$ makes the simplifying assumption that edge changes occur independently of each other. Our first test thus considers the case of an individual edge changing at each timestep, and we consider perturbations applied to a barbell graph, to observe the effects of network structure, a ring graph, to observe the effects of weight with structural equivalence, and an Erdős-Rényi (ER) graph as a baseline. The results in figures 1, 2 and 3 show the line of constant $l_e$, overlaid with the observed $\Delta A_e$ and corresponding $\Delta\lambda$ values. We see here that our linear approximation generally holds for relative edge changes less than $\Delta A = 0.05$. We also see for the barbell graph that $l_e$ captures the structural role of the edges, with edges in the cliques having higher values of $l_e$ than those in the bridge. For the ring graph, we observe a poorer fit for edges with low values of $l_e$, and the larger $l_e$ edges tend to be adjacent to edges with similar $l_e$ values. Although the edge with the largest weight also has the largest value of $l_e$, in general there does not appear to be a simple relationship between edge weight, or weight of neighbouring edges, and the value of $l_e$. Similar observations are made for the weighted ER graph, with the lowest $l_e$ values observed for more peripheral edges, and the two edges with the largest weights also having the highest $l_e$ values. Further results for the case of a weighted barbell, and unweighted ring and random networks, are shown in appendix A.
Results for the case of two edges changing are also shown in appendix A. In these we observe that, for the barbell graph, a better fit is obtained for higher values of $l_e$. For the ring networks and random networks, we see that our model performs well if the observed edge has a larger value of $l_e$ than the other changing edge, but performs poorly when its value of $l_e$ is smaller. The case of complete structural equivalence and equal weights in the unweighted ring network shows good performance for all edges.
The breakdown of the method when there are multiple changes occurring between snapshots suggests that our approximation for $l_e$ may be better suited to a continuous or pseudo-continuous representation of a temporal network, which can be seen as the limit of a discrete temporal network in which each snapshot captures an individual edge change occurring at an infinitesimally different time to the neighbouring snapshot changes.

Relationship between $l_e$ and the presence of edge changes
We can understand the role of the parameters $\alpha$ and $\rho$ by observing the effect of varying the parameters on the distributions of the values of $l_e$ for changing vs. non-changing edges, $P(\Delta A = 0 \mid \ln(l_e))$. We first consider this for data generated according to our model in equation 6, first keeping $\rho$ fixed and varying $\alpha$, then fixing $\alpha$ and varying $\rho$.

Model with varying ρ
Figure 7 shows the distributions of $P(\Delta A = 0 \mid \ln(l_e))$. We see here that for increasing $\rho$, the probability of observing no change increases, and also that for increasing $l_e$ the probability decreases for a given $\rho$, at a rate that shows a significant dependence on $\rho$.

Predictability improvement with α and ρ
As detailed in section 3.2, here we apply a logistic regression classifier with the single feature $l_e$ to datasets with varying $\alpha$ and $\rho$. Figures 8a and 8b show the improvement in the test set Precision-Recall Area Under Curve scores for increasing values of each parameter. We see from these that increasing both parameters improves the predictability of changes given the value of $l_e$, consistent with our observations of the rate of increase of change probability being positively correlated with both $\alpha$ and $\rho$.

Static observations in real data
We have seen above, in application to synthetic networks, that our model behaves as expected, with networks with a large $\rho$ (and $\alpha$) being more predictable. Now we explore the performance of our structural influence metric and model through the application to five real datasets. Firstly, given that our research has been motivated by a need to monitor risks in a financial setting, we considered a network of country-level bilateral trade [46] and three different capital markets transaction datasets reported under MIFID II regulations. However, our methods can be applied more generally to any temporal networks, and due to the availability and high volume of research conducted into social networks (see [10]), we also considered a network of messages sent between college students [47]. A full description of these can be found in appendix D.
In order to understand the usefulness of $l_e$ as a metric for structural importance, we first examine the edges that rank the highest according to their values of $l_e$ for the bilateral trade dataset, since the historical context of international trade can give us an idea of which edges we might expect to be 'important'. For the bilateral trade dataset, we see the largest values of $l_e$ for the edge between Portugal and Spain in 1872, and considering the sum across all time, for Greece and Turkey. These are examples of edges with both nodes having large eigenvector centrality; edges involving only one central node are seen to have lower values of $l_e$. This means that inter-European edges almost exclusively make up the top 100 ranked edges, whereas the lowest ranked $l_e$ edges occur when one, or both, of the nodes have very low centrality scores. Similarly, for the other datasets, the highest values of $l_e$ were also observed for edges involving nodes with high eigenvector centrality. In general, we see that the rankings of $l_e$ are uncorrelated with the rankings of edges according to their betweenness centrality, or their mean value of $\Delta A$, but do in some cases correlate with the product of the participating nodes' degrees and strengths. As these datasets contain large numbers of edges (the smallest contained 2785 edges), we cannot fully explore all of the individual observed values of $l_e$ as we did for the toy networks. Instead, we consider the probabilities of observing values of $l_e$ by making use of Kernel Density Estimation to estimate the probability density functions from the data.
Figure 9 shows the estimated probability density functions of the logarithm of $l_e$. For all networks, the values observed for $l_e$ tend to be very small. Omitting the tails of the distributions at vanishingly small values of $l_e$, we see a similarity in the values of $l_e$ observed across the three equity datasets. Although the distribution is found to be approximately lognormal across all 5 datasets analysed, the social network shows a much broader distribution of $l_e$. The peak of the distribution for the college messaging dataset is also much lower, at approximately $\ln(l_e) = -8.8$, whereas the bilateral trade dataset shows a peak at −3.3, and the equity datasets at −3, −2.5 and −4.2.

Dynamic observations in real networks
We now address the central question of how the $l_e$ values observed for our real networks relate to the probability of an edge changing. Figure 10 shows the distributions of the $\ln(l_e)$ values observed for non-changing edges in comparison to changing edges. In all cases, there is a shift in the mean value of $\ln(l_e)$ towards higher values for edges which do change, which is suggestive of a positive ρ parameter and, potentially, of the ability to predict the presence of changes given the value of $l_e$. The smallest shifts are observed for the Bilateral Trade and Equity-3 datasets, which show negligible differences in the mean and quartiles of $l_e$ between changes and no changes, suggesting that we might not expect changes to be predictable from the values of $l_e$ in these cases. Nevertheless, in all cases the difference in the mean value of $l_e$ for change vs. no change is statistically significant, with a two-sided t-test giving p < 0.05 for all datasets.
To further understand how the value of $l_e$ relates to the probability of an edge changing, we look at the distributions of $P(\Delta A = 0 \mid l_e)$, shown in figure 11. For the bilateral trade and Equity-3 datasets, the probability of ∆A = 0 decreases with increasing $l_e$ for the bulk of the distribution; however, the rarely observed edges with $l_e > 0.3$ in these datasets show larger probabilities of remaining unchanged. We again see a slight initial decrease for the Equity-1 and Equity-2 datasets, but the relationship is clearly non-linear for large $l_e$. The college messaging dataset shows a much larger probability in general for edges to remain unchanged, with a very slight decrease in this probability for very small $l_e$ values, but it is dominated by noise for $l_e > 0.05$. In section 4.2, we considered the ideal cases of linear positive, neutral and negative relationships between $l_e$ and the probability of edge changes. In reality, as figure 11 shows, things are more complex, with different relationships apparent for different ranges of $l_e$. In particular, for edges with lower values of $l_e$, the negative relationship between $l_e$ and the probability of an edge remaining unchanged suggests that a parameterisation of our model with a positive value of ρ would be effective in capturing the behaviour of the bulk of the network. However, changes to the small handful of edges with the largest values of $l_e$ are less likely. These observations could suggest that there are a few structurally important edges which act to stabilise a system that would otherwise move towards a regime of instability.

Estimation of α and ρ from data
In table 2, we present the values of α and ρ estimated for our 5 datasets. The errors on these estimates are obtained from the inverse Hessian of the log-likelihood, which is found by numerical approximation. In comparison with figures 5 and 11, the ordering of the estimated values of α appears to agree with the positions of the college messaging dataset and the equity datasets. The parameter ρ appears to correspond to the overall gradients observed in figure 11 for the bulk of the distributions at low values of $l_e$. These observations suggest that our model mostly captures the imbalance of observed changes in the parameter ρ, and the overall average change probability for each dataset in the parameter α. Figure 31 in the appendix shows the result of generating distributions of $P(\Delta A = 0 \mid l_e)$ for the estimated parameters, in comparison to the real datasets, restricted to $l_e < 0.3$ in order to observe the bulk of the distributions. The datasets generated according to the estimated parameters show reasonable agreement with the actual distributions, and the differences can be attributed to differences in the initial network conditions.

Edge change predictability
Given the non-zero estimated values of the parameters α and ρ, it is natural to assess how well the value of $l_e$ predicts a subsequent change. Figures 12 and 13 show the Receiver Operating Characteristic (ROC) and Precision-Recall (PR) curves for the 5 datasets. All datasets are seen to perform slightly better than the dummy model, with better performance seen for the College Messaging, Equity-1 and Equity-2 datasets, which also show larger differences in the distribution of $l_e$ between change and no change in figure 6. Poorer performance is seen for the bilateral trade and Equity-3 datasets, which show similarly shaped distributions in figure 11: an initial steep decrease in the probability of remaining unchanged for increasing $l_e$, with the trend appearing to reverse for $l_e > 0.3$. These datasets also show little difference in the distribution of values observed in figure 6.
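As an illustration of how such a ROC score can be obtained, the following is a minimal sketch rather than the pipeline used in this study. It relies on the fact that a logistic regression with a single feature produces a probability that is monotone in that feature, so when the fitted coefficient is positive the ROC AUC of the classifier equals the ROC AUC of the raw score $\ln(l_e)$ (and one minus that value when the coefficient is negative). The function name `roc_auc` and the toy values are hypothetical and are not taken from any of the datasets.

```python
import math


def roc_auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney U) formulation.

    scores: floats, here ln(l_e) for each edge
    labels: 0/1 flags, 1 meaning the edge subsequently changed
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    # Assign 1-based average ranks so that tied scores are handled correctly.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one changed and one unchanged edge")
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


# Hypothetical toy values for illustration only.
l_e = [0.001, 0.002, 0.01, 0.05, 0.2, 0.3]
changed = [0, 0, 1, 0, 1, 1]
auc = roc_auc([math.log(v) for v in l_e], changed)
```

A value of 0.5 corresponds to a score that carries no ranking information about which edges change, which is a useful reference point when reading figure 12.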

Relationship between $l_e$ and size of weight changes
We now consider whether the value of $l_e$ has an effect on the scale of subsequent edge changes. As in section 4.2, we again consider data generated according to the model in equation 6, and we choose to take $U^{t}_{ij} \sim \mathcal{N}(\mu = 0, \sigma = \beta l_e^{\gamma})$. This introduces two new parameters: β, which controls the width of the distribution of edge changes, and γ, which controls the extent to which $l_e$ influences the variance of the edge change distribution.
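The role of β and γ can be illustrated with a short simulation. This is a sketch under stated assumptions: it only draws the Gaussian term with standard deviation $\beta l_e^{\gamma}$ and does not reproduce how that term enters equation 6. The function names `sample_weight_change` and `empirical_std` and the parameter values in the usage note are hypothetical.

```python
import random


def sample_weight_change(l_e, beta, gamma, rng=random):
    """Draw one value from N(0, sigma) with sigma = beta * l_e**gamma."""
    if l_e <= 0 or beta <= 0:
        raise ValueError("l_e and beta must be positive")
    sigma = beta * l_e ** gamma
    return rng.gauss(0.0, sigma)


def empirical_std(l_e, beta, gamma, n=20000, seed=0):
    """Sample standard deviation of n simulated changes for a fixed l_e."""
    rng = random.Random(seed)  # seeded so that the sketch is reproducible
    draws = [sample_weight_change(l_e, beta, gamma, rng) for _ in range(n)]
    mean = sum(draws) / n
    return (sum((d - mean) ** 2 for d in draws) / (n - 1)) ** 0.5
```

For example, comparing `empirical_std(0.05, 1.0, 1.0)` with `empirical_std(0.5, 1.0, 1.0)` gives a wider spread for the larger $l_e$, while repeating the comparison with `gamma=-1.0` reverses the ordering.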

Variation of γ
Figure 14 shows the distributions of $P(\ln(1 + \Delta A), l_e)$ for a range of values of γ. For positive γ, the distribution widens for larger $l_e$. For negative γ, we see the opposite: the distribution becomes narrower for larger $l_e$.

Weight distributions for real networks
We now consider the same 5 real datasets considered in section 4.2.5. Figure 16 shows the distributions of $P(\ln(1 + \Delta A), \ln(l_e))$ for edges that do change, i.e. ∆A ≠ 0, for the five real networks. Here ∆A refers to the relative change in the value of the edge weight from $t_0$ to $t_1$, which takes values in the interval [−1, ∞], and $l_e$ is measured at time $t_0$. Infinite values of ∆A, corresponding to a new edge appearing, were observed but are not captured in the plots. These account for 4.7% of the bilateral trade dataset, 0.086% of the college messaging dataset, and 0.012%, 0% and 0.0028% of the equity datasets³. We see a slight widening of the distributions for larger values of $l_e$ for the Equity-1 and Equity-2 datasets, and to a larger extent for the third equity dataset. The bilateral trade dataset shows initial widening as $l_e$ increases, but narrows again for the edges with the largest $l_e$. The college messaging dataset shows two distinct peaks, corresponding to changes in edge weight of ±1, which are over-represented in this dataset because it is unweighted and the edge weight solely represents the count of interactions in the time window considered. The slight widening for larger $l_e$ in all datasets is suggestive of a positive relationship between the value of $l_e$ and the variance of the distribution of subsequent edge changes.
All 5 datasets show positive estimated values of γ (table 4), suggestive of a relationship between the width of the distribution of edge changes and the value of $l_e$. The dataset with the highest value of γ, the bilateral trade dataset, also shows the largest bias towards a wider change distribution for higher $l_e$ in figure 16. Correspondingly, the lowest γ value is seen for the college messaging dataset, which shows the least bias towards larger changes occurring for larger values of $l_e$. The values of β are similar across the 5 datasets, and all relatively low. It is difficult to draw conclusions from these, as the behaviours controlled by the two parameters cannot be separated and observed alone in the distributions in figure 16.

Discussion & Conclusion
The ability to understand how microscopic changes in networks affect their macroscopic evolution across time is one of the key challenges in dynamic network analysis. In this study we have begun to explore the use of derivatives of network spectra to capture this. We derive a measure of edge-based structural influence, $l_e$, and explore the extent to which its value is indicative of future changes. We first demonstrated that, for small and isolated perturbations applied to the network, the eigenvalue derivative is approximated well by equation 3. However, the approximation breaks down when multiple changes happen during the same time snapshot, suggesting that the measure may be more suited to a continuous or pseudo-continuous representation of network evolution, in which each time snapshot contains a single edge change.
Considering the 5 real datasets, we observe lognormal distributions of the values of $l_e$, indicating structural influence dominated by a small handful of edges. We propose a model in which the probability of an edge changing is given by $\alpha l_e^{\rho}$. This model allows us to control the extent to which $l_e$ dictates the propensity for an edge to change, and also the scale of a subsequent change. Focusing on the former, we observe similarities between the shapes of the distributions of $P(\Delta A = 0 \mid \ln(l_e))$ for synthetic networks generated according to this model and those observed in the data, and the values estimated for α and ρ are suggestive of a relationship between the value of $l_e$ and the subsequent presence of change. When using $l_e$ in a logistic regression classifier to predict change, we see that $l_e$ is slightly predictive of change in all cases, but only marginally so for the bilateral trade and Equity-3 datasets. This corresponds with our observations of small values of ρ for these datasets, along with similar, non-linear shapes of the distribution of the probability of no change for increasing $l_e$. These observations indicate that static structural importance can be indicative of the presence of a subsequent change; however, more work is needed to understand the shape of the distribution and to identify different $l_e$ regimes. We will also consider taking a similar approach with other measures of edge importance, for example edge gravity [17]. More work is also needed to understand the subsequent impact of an edge changing on the global network structure. It may be that a change to an influential edge could act to destabilise a system; conversely, the change could move the system towards a state of stability. We will look to investigate this in future analyses.
We note here that α and ρ themselves are useful parameters that could be used to classify networks according to their growth stability. A large value of α would be an indicator of higher levels of overall network activity. A network with very large ρ would be characterised by changes occurring to the edges with the largest $l_e$; conversely, a network with very small ρ would see changes distributed across all edges, regardless of the value of $l_e$. In the context of financial markets, these contrasting situations would require different approaches, and ρ could be used by policy makers to decide whether an asset class should be monitored as a whole (for small ρ) or by targeting the edges with the highest $l_e$ (for large ρ).
Our model does not account for edges appearing and disappearing in the network, and it assumes that edge changes are independent of each other. For the first limitation, we note that edge appearance and disappearance would be unlikely to heavily influence the behaviour of the Equity networks, as we observed very low percentages (0.012%, 0% and 0.0028%) of new edges appearing⁴, but for the other two networks this behaviour is much more prominent, at 4.7% for the bilateral trade network and 0.086% for the college messaging network. The measure $l_e$ itself is able to assess the importance of an edge that subsequently disappears, and also of edges that appear between two existing nodes, so understanding how these appearances and disappearances can be captured in a model for network growth would be highly beneficial for future work. On the second point, we noted in our exploration of toy networks that our approximation of the eigenvalue derivative breaks down when multiple edge changes are present. Conversely, many works such as Bandi et al. have noted that predictability is aggregation scale specific. In future work we will thus investigate the trade-off between improved approximation of $l_e$ in the quasi-continuous limit, in which each time snapshot contains a single edge change, and improved predictability for larger aggregation scales.
In addition to this, further analysis is needed to assess the effectiveness of $l_e$ as an indicator of risk. So far we understand that the value of $l_e$ bears some relationship to how the network subsequently changes, but we have not yet considered the resultant changes of edges with high values of $l_e$, and how these affect the rest of the network in terms of risk and stability. This is another area we will pursue in future work. We will also consider extending our methods to structural node importance, which is of use to policy makers who may wish to monitor which players could have an adverse impact on markets. It is also worth noting that although using raw transaction data gives us the most granular view of the data, our work has so far not considered the higher order effects of trading behaviour on price. Such an effect results in the influence of edges reaching disconnected components, which cannot be captured by our methods, so we will consider generalising our methods to allow for networks with disconnected components. Finally, we will consider using our methods for classification of a large number of networks, and we will extend our methods to understand the parameters which control the resultant weight changes.

Data availability
The datasets referred to as Equity-1, Equity-2 and Equity-3 in this paper were extracted from a dataset of transaction reports collected by the FCA under MIFID II regulations. The datasets were used under agreement from the data owners at the Financial Conduct Authority for the current study, and are not publicly available. A data note describing the data is given in appendix D, and comparable analysis is conducted for all investigations on two open source datasets (see below).
The dataset referred to as Bilateral Trade is hosted by Katherine Barbieri, University of South Carolina, and Omar Keshk, Ohio State University, and is available at http://correlatesofwar.org. The dataset referred to as College Messaging is available in the Stanford Large Network Dataset Collection repository, https://snap.stanford.edu/data/CollegeMsg.html. An implementation of the methods referenced in this paper can be found at [48].

For the directed case, our perturbation changes just a row (or column) independently, i.e.
Then, expanding the indices,
leading us to the result
where $\lambda^{M}_{0,i}$ is the ith component of the eigenvector corresponding to the leading eigenvalue of M.
In both the directed and undirected cases above, it is worth noting that the derivations can be generalised to allow links to be added or removed; however, nodes cannot be added or removed.

C Kernel Density Estimation for conditional probability estimation
We make use of multivariate conditional Kernel Density Estimation (KDE) to find the probability distributions for the values of $l_e$. The KDE is defined in terms of a kernel function, which is non-negative, and a bandwidth h, which is a smoothing parameter. We have used a Gaussian kernel.
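For orientation, the following is a minimal sketch of the standard univariate Gaussian KDE, $\hat{f}_h(x) = \frac{1}{nh}\sum_{i} K\left(\frac{x - x_i}{h}\right)$ with $K(u) = \frac{1}{\sqrt{2\pi}} e^{-u^2/2}$. It is not the multivariate conditional estimator used in this study. The helper `prob_unchanged` is a hypothetical illustration of one way to turn two class-conditional density estimates into a conditional probability via Bayes' rule; all function names and the shared bandwidth are assumptions.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used as the kernel K(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, samples, h):
    """Univariate KDE: f_h(x) = 1 / (n h) * sum_i K((x - x_i) / h)."""
    if h <= 0 or not samples:
        raise ValueError("need a positive bandwidth and at least one sample")
    return sum(gaussian_kernel((x - s) / h) for s in samples) / (len(samples) * h)


def prob_unchanged(x, xs_unchanged, xs_changed, h):
    """Estimate P(dA = 0 | x) by Bayes' rule from two class-conditional KDEs.

    x would typically be ln(l_e); the two sample lists hold the ln(l_e)
    values of edges that did not change and of edges that did change.
    """
    w0 = len(xs_unchanged) * kde(x, xs_unchanged, h)  # proportional to p(x, dA = 0)
    w1 = len(xs_changed) * kde(x, xs_changed, h)      # proportional to p(x, dA != 0)
    return w0 / (w0 + w1)
```

The bandwidth h trades smoothness against resolution: a larger h gives smoother curves but can blur features such as the reversal seen at large $l_e$. Far from all samples both weights underflow to zero, so in practice the estimate should only be evaluated within the range of the data.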

D Dataset descriptions
The first dataset considered tracks bilateral trade flows between states from 1870 to 2014, describing import and export data in current U.S. dollars for pairs of sovereign states [46]. This dataset is interesting not just due to its relevance to our focus on financial markets, but also due to an observed growth across time, apart from in two time periods corresponding to the First and Second World Wars.
The second dataset considered was a dataset of private messages sent on an online social network at the University of California. An edge (u, v, t) means that user u sent a private message to user v at time t. As this network is unweighted, the weights of all of the edges have been set to 1. The network was aggregated to daily snapshots, in which the edge weight is the number of times that edge is active during that day.
Finally, in order to observe the effects of different trading structures on the output of our methods, we applied our techniques to transaction reports relating to three different equity stocks traded on the UK capital markets. The data was aggregated daily, and covers a 2 year period from January 2018⁵. The Equity-3 dataset was analysed for the shorter time range of 03/06/2019 to 05/11/2019. We chose to study networks of transactions for stocks of energy companies due to the high level of trading activity in this sector. The results displayed in this paper consider the giant component networks of 3 different stocks. The first two instruments were traded without the presence of CCPs, one focusing on oil and gas exploration and production and the second focusing on renewable and alternative energy. The third instrument, another oil and gas production stock, shows a network dominated by the presence of a CCP. Due to the sensitivity of the data, these have been referred to as Equity networks 1, 2 and 3 throughout this paper. The Equity data was made available by the FCA for use in this study, and is not publicly available. To provide the reader with additional context, we include here some high level network statistics for these networks. All statistics are based on the networks following the removal of nodes which appear on fewer than 5 days in the sample, which we classed as 'inactive'.
We see that all three networks have similar connectivities, but Equity-3 is significantly denser than the other two and shows a higher level of reciprocity. We can see from figures 26a and 26b that large transaction values are more likely to have a high reciprocity for the first and second datasets. However, the same cannot be said for the third Equity dataset, as shown in figure 26c.
The evolution of high level network statistics is shown in figures 27, 28 and 29. Here we see that the networks fluctuate around a relatively stable mean, with no obvious growth or decay across the time period.
It is further interesting to note that the third network considered shows the presence of a hierarchy, due to it being an instrument that is traded mainly through the use of Central Clearing Parties (CCP), producing a tiered structure as shown in figure 30. Such a structure can be identified by finding a dominant node in the network, i.e. a node with significantly higher degree, and examining its ego network.

E Parameter estimations

E.1 Estimation of ρ and α
We assume in this section that our networks can be described by a model in which the probability of an edge changing is given by
\[ \begin{cases} 0 & \alpha l_e^{\rho} \le 0 \\ \alpha l_e^{\rho} & 0 < \alpha l_e^{\rho} < 1 \\ 1 & \alpha l_e^{\rho} \ge 1 \end{cases} \]
The maximum likelihood estimate of θ then follows the same procedure as in the case of a (potentially biased) coin toss: given a sample of changes $x_i$, the likelihood of observing these changes given θ is
Since $\alpha l_e^{\rho}$ is constrained to be a probability, to estimate the parameters which result in the maximum likelihood, we need to minimise the negative log-likelihood with respect to multiple inequality constraints:
\[ 0 \le \alpha l_e^{\rho} \le 1 \tag{29} \]
where we have one inequality constraint for each $l_e$. To do this, we make use of the Karush-Kuhn-Tucker conditions [45] and numerical optimisation to find the optimal saddle point which maximises L whilst satisfying these constraints.
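The estimation can be illustrated with a small sketch. It assumes the coin-toss (Bernoulli) likelihood $\prod_i p_i^{x_i} (1 - p_i)^{1 - x_i}$, where $p_i$ is the clipped value of $\alpha l_{e,i}^{\rho}$ and $x_i \in \{0, 1\}$ flags whether edge i changed. In place of the Karush-Kuhn-Tucker treatment described above, it substitutes a brute-force grid search, which is only practical for coarse grids. All function names and the `eps` guard are hypothetical.

```python
import math


def change_probability(l_e, alpha, rho):
    """Clip alpha * l_e**rho to the interval [0, 1]."""
    return min(1.0, max(0.0, alpha * l_e ** rho))


def negative_log_likelihood(alpha, rho, l_e_values, changed, eps=1e-12):
    """Bernoulli negative log-likelihood of the observed change flags."""
    nll = 0.0
    for l_e, x in zip(l_e_values, changed):
        p = change_probability(l_e, alpha, rho)
        p = min(1.0 - eps, max(eps, p))  # keep log() finite at the clip boundaries
        nll -= x * math.log(p) + (1 - x) * math.log(1.0 - p)
    return nll


def grid_search(l_e_values, changed, alphas, rhos):
    """Return the (alpha, rho) pair on the grid with the smallest NLL."""
    return min(
        ((a, r) for a in alphas for r in rhos),
        key=lambda ar: negative_log_likelihood(ar[0], ar[1], l_e_values, changed),
    )
```

With ρ fixed at 0 the model reduces to a single coin with bias α, so the grid point closest to the observed fraction of changed edges should be selected. This provides a quick sanity check before moving to a constrained optimiser.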

E.2 Estimation of β and γ
For the case where the edge changes are drawn from a Gaussian distribution with µ = 0 and $\sigma = \beta l_e^{\gamma}$, the log-likelihood is given by ln(L) =
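For reference, the log-likelihood implied by these assumptions can be written out explicitly. The expression below is the standard zero-mean Gaussian log-likelihood with $\sigma_i = \beta l_{e,i}^{\gamma}$. The symbol $u_i$ for the i-th observed edge change and the index i over the n observations are notation assumed here for illustration.

```latex
\ln L(\beta, \gamma)
  = \sum_{i=1}^{n} \left[
      -\ln \beta
      - \gamma \ln l_{e,i}
      - \tfrac{1}{2} \ln(2\pi)
      - \frac{u_i^{2}}{2 \beta^{2} l_{e,i}^{2\gamma}}
    \right]
```

Setting the derivative with respect to β to zero gives $\hat{\beta}^{2} = \frac{1}{n} \sum_{i} u_i^{2} \, l_{e,i}^{-2\gamma}$ for a fixed γ, so under these assumptions the numerical search can be reduced to a one-dimensional search over γ.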

F Comparison of data distributions to model
Figure 31 shows the bulk of the distributions of $P(\Delta A = 0 \mid \ln(l_e))$ for our 5 datasets, in comparison to the equivalent distributions generated from our model for network evolution given by equation 6.

Figure 1: Scatter plot of perturbations ∆A and the resulting ∆λ, compared to the line of constant $l_e$. Barbell graph with equal initial weights.

Figure 2: Scatter plot of perturbations ∆A and the resulting ∆λ, compared to the line of constant $l_e$. Ring graph with each edge independently assigned a random integer between 1 and 10.

Figure 3: Scatter plot of perturbations ∆A and the resulting ∆λ, compared to the line of constant $l_e$. Erdős-Rényi graph with each edge independently assigned a random integer between 1 and 10.

4.2.1
Figures 4 and 5 show the resulting distributions for varying values of α. We see that an increase in α results in a decrease in the probability of an edge remaining unchanged for all values of $l_e$, and for larger values of α, the rate of increase of change probability with $l_e$ is slightly larger.

Figure 4: Distributions of $l_e$ for edge changes vs. no changes, when varying α.

Figure 6: Distributions of $l_e$ values for the case of edge changes vs. no changes.

Figure 8: Improvement in Precision-Recall AUC scores over the dummy model.

Figure 9: Probability distribution of the values of $\log(l_e)$ for different networks.

Figure 10: Boxplots showing the distribution of $l_e$ values observed according to whether the edge subsequently changes.

Figure 11: $P(\Delta A = 0 \mid \ln(l_e))$ as a function of $\ln(l_e)$ for the 5 real datasets.

Figure 12: ROC curves for a logistic regression classifier making use of $\ln(l_e)$ to predict ∆A = 1. The dashed lines and shaded areas represent the mean and 95% confidence intervals for the dummy model.

Figure 13: PR curves for a logistic regression classifier making use of $\ln(l_e)$ to predict ∆A = 1. The dashed lines represent the results for a stratified random allocation of labels.

Figure 16: Contours showing the distributions of $P(\ln(1 + \Delta A), \ln(l_e))$ for the 5 real datasets. The underlying observations of $\ln(l_e)$ and $\ln(1 + \Delta A_{rel})$ are shown as dots.

Figure 17: Scatter plot of perturbations ∆A and the resulting ∆λ, compared to the line of constant $l_e$. Weighted barbell network.

Figure 19: Scatter plot of perturbations ∆A and the resulting ∆λ, compared to the line of constant $l_e$. Unweighted Erdős-Rényi network.

Figure 22: Scatter plot of perturbations ∆A and the resulting ∆λ, compared to the line of constant $l_e$, for the case of two edges changing. The plots consider one of the two perturbed edges. Erdős-Rényi network with randomly assigned weights.

Figure 24: Scatter plot of perturbations ∆A and the resulting ∆λ, compared to the line of constant $l_e$, for the case of two edges changing. The plots consider one of the two perturbed edges. Ring network with randomly assigned weights.

Figure 25: Scatter plot of perturbations ∆A and the resulting ∆λ, compared to the line of constant $l_e$, for the case of two edges changing. The plots consider one of the two perturbed edges. Erdős-Rényi network with randomly assigned weights.

Figure 26: Reciprocity vs. price for the Equity networks.

Figure 27: Daily counts of nodes and edges, density and reciprocity across the entire investigation period for Equity-3.

Figure 30: Example of a tiered structure in the GWCC of the trading network for an instrument frequently traded via an individual CCP.

Figure 31: $P(\Delta A = 0 \mid \ln(l_e))$ as a function of $\ln(l_e)$ for the 5 real datasets, overlaid with the distributions for data generated according to the model in equation 6.

Table 1: Spearman's rank correlations for $l_e$ with the rank by edge weight, edge betweenness centrality and product of nodes' degrees. Columns: Dataset, Corr($l_e$, ∆A), Corr($l_e$, EBC), Corr($l_e$, $\deg_{n_1} \times \deg_{n_2}$), Corr($l_e$, $S_{n_1} \times S_{n_2}$).

Table 2: Estimated α and ρ for the 5 real datasets.

Table 3: Values of Area Under Curve scores for ROC and Precision-Recall curves. Numbers in brackets represent the score achieved by a model which randomly predicts 1 or 0 in proportion to the dataset prior.

The bilateral trade and Equity-3 datasets are also found to have low values of ρ. Although the college messaging dataset shows the best performance, particularly in the left hand side of the ROC curve, this is driven by the significant class imbalance, with only 5% of the observations showing a non-zero ∆A, as opposed to the bilateral trade dataset, in which 20% of the observations are non-zero changes.

Table 4: Estimated β and γ for the 5 real datasets.

³ The prominence of new edges has been significantly reduced by focusing on the giant component.

Table 5: Network statistics for the three Equity datasets.

All datasets show similar values for the correlation coefficient of the adjacency matrix.