We introduce the fastviz algorithm, which takes as input a chronological stream of interactions between nodes (i.e., network edges) and converts it into a set of graph updates that account only for the most relevant part of the network. The algorithm has two stages: buffering of the filtered network and generation of differential updates for the visualization (see Figure 1). The algorithm stores and visualizes the nodes with the highest strengths, i.e., the highest sums of the weights of their connections.

### 3.1 Input

The input data is a chronologically ordered sequence of interactions between nodes. The interactions can be either pairwise or cliques of interacting nodes. For instance, the following input:

$\langle t_{i}, n_{1}, \dots, n_{m}, w_{i} \rangle$

represents the occurrence of an interaction of weight $w_i$ among nodes $n_1, \dots, n_m$ at epoch time $t_i$. Entries with more than two nodes are interpreted as interactions happening between each pair of members of the clique, each with the respective weight. Multiple interactions between the same pair of nodes are aggregated by summing their weights. The advantage of the clique-wise format over the pairwise format is that the input files are smaller.
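The clique-wise expansion described above can be sketched as follows; the record layout and the function name are our own, not part of fastviz:

```python
from itertools import combinations

def clique_to_pairs(entry):
    """Expand one clique-wise entry <t, n1, ..., nm, w> into the
    pairwise interactions it stands for, each carrying weight w."""
    t, *nodes, w = entry
    return [(t, a, b, w) for a, b in combinations(nodes, 2)]

# A three-node clique expands into three pairwise interactions.
pairs = clique_to_pairs((1359330000, "a", "b", "c", 1.0))
```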

### 3.2 Filtering criterion

In the first stage of the algorithm, at most $N_{\mathrm{b}}$ nodes with the highest strengths are kept in the buffer, together with the interactions among them. The strength $S_i$ of a node $i$ is the sum of the weights of all connections of that node, i.e., $S_i = \sum_j w_{ij}$, where $w_{ij}$ is the weight of the undirected connection between nodes $i$ and $j$. Whenever a new node that does not yet appear in the buffer is read from the input, it replaces the buffered node with the lowest strength. If an incoming interaction involves a node that is already in the buffer, the strength of that node is increased by the weight of the incoming connection. To emphasize the most recent events and penalize stale ones, a forgetting mechanism that decreases the strengths of all nodes and the weights of all edges is run periodically, every time period $T_{\mathrm{f}}$, by multiplying their current values by a forgetting factor $0 \le C_{\mathrm{f}} < 1$. This process removes old inactive nodes with low strength, while retaining old nodes with fresh activity and high strength.
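A minimal sketch of this buffering stage, with the eviction and forgetting rules as just described; the class and attribute names are ours, and strengths are kept as running sums rather than recomputed from the stored edge weights:

```python
class Buffer:
    """Keep at most n_b nodes with the highest strengths (sketch)."""

    def __init__(self, n_b, c_f):
        self.n_b, self.c_f = n_b, c_f
        self.strength = {}   # node -> S_i
        self.weight = {}     # frozenset({i, j}) -> w_ij

    def add_interaction(self, i, j, w):
        for n in (i, j):
            if n not in self.strength and len(self.strength) >= self.n_b:
                # Evict the node with the lowest strength.
                weakest = min(self.strength, key=self.strength.get)
                del self.strength[weakest]
                self.weight = {e: v for e, v in self.weight.items()
                               if weakest not in e}
            self.strength[n] = self.strength.get(n, 0.0) + w
        e = frozenset((i, j))
        self.weight[e] = self.weight.get(e, 0.0) + w

    def forget(self):
        """Run every T_f: decay all strengths and weights by C_f."""
        for n in self.strength:
            self.strength[n] *= self.c_f
        for e in self.weight:
            self.weight[e] *= self.c_f
```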

Note that the forgetting mechanism corresponds to a sliding time-window with exponential decay. The decay determines the weighting of past interactions in the sliding time-window aggregation of a dynamic network. A standard rectangular sliding time-window aggregates all past events within the width $T_{\mathrm{tw}}$ of the time-window, weighting them equally. In contrast, in fastviz and in the sliding time-window with exponential decay, the weighting decreases exponentially (see Figure 2). (Under a set of assumptions one can calculate how long a given node will stay in the buffered network. Let us assume that at time $t_n$ the strength of a node $n$ is $S_n(t_n)$, that this strength will not be increased after time $t_n$, that the next forgetting will happen in $T_{\mathrm{f}}$ time, and that the strength of the weakest buffered node $S_{\mathrm{w}} < S_n(t_n)$ is constant over time. Under these assumptions, the node $n$ will stay buffered for time $t - t_n > \frac{\log(S_{\mathrm{w}}/S_n(t_n))}{\log(C_{\mathrm{f}})} T_{\mathrm{f}}$.) Such exponential decay has two advantages over a standard rectangular sliding time-window approach. First, it gives more importance to both the most recent and the oldest connections, while giving less importance to middle-aged interactions. Second, it produces a dynamic network in which changes are smoother, due to the balanced weighting of old and new connections. Finally, instead of using the sliding time-window with exponential decay, we introduced the fastviz algorithm to limit the computational complexity of network filtering; in principle, time-window methods do not introduce such a bound. We explore and confirm these points in the following subsections using real dynamic networks.
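The bound above can be evaluated directly; the function name and the example numbers below are ours, chosen only for illustration:

```python
import math

def min_buffered_time(s_n, s_w, c_f, t_f):
    """Time for which a node of strength s_n stays buffered, under the
    stated assumptions (constant weakest strength s_w < s_n)."""
    return math.log(s_w / s_n) / math.log(c_f) * t_f

# A node 10x stronger than the weakest, with C_f = 0.5 and T_f = 60 s,
# stays buffered for more than log(0.1)/log(0.5) * 60 s, roughly 199 s.
t = min_buffered_time(s_n=10.0, s_w=1.0, c_f=0.5, t_f=60.0)
```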

### 3.3 Filtering criterion versus rectangular and exponential sliding time-windows

Comparing the structural properties of networks produced with different filtering methods is not straightforward. First, since the networks are dynamic, one needs to compare the structural properties of static snapshots of the networks produced by the two methods at the same time. Second, the parameters of the methods, i.e., the forgetting factor $C_{\mathrm{f}}$ and the time-window width $T_{\mathrm{tw}}$, influence the algorithms, so one needs to draw an equivalence between them to compare the methods under the same conditions. A natural condition to consider is that of equal areas under the curves of Figure 2, which represent the contribution of an interaction event to the representation of a node over time. Note that under this condition a node with constant non-zero activity in time will have the same strength in networks created with each method. For fastviz, the area $A_{\mathrm{fv}}$ under the aggregation curve is equal to the sum of a geometric progression. Assuming an infinite geometric progression, we get the approximation $A_{\mathrm{fv}} = T_{\mathrm{f}}/(1-C_{\mathrm{f}})$. The area under the aggregation curve of the rectangular time-window is simply $A_{\mathrm{tw}} = T_{\mathrm{tw}}$. By requiring the areas to be equal, we obtain the relation between the parameters of the two methods

$$T_{\mathrm{tw}} = \frac{T_{\mathrm{f}}}{1 - C_{\mathrm{f}}}. \qquad (1)$$
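Equation (1) follows from summing the geometric aggregation curve; a quick numerical check (the parameter values are illustrative, not from the paper):

```python
def fastviz_area(t_f, c_f, terms=10_000):
    """Area under the fastviz aggregation curve: each event's
    contribution is multiplied by C_f once per forgetting period T_f."""
    return t_f * sum(c_f ** k for k in range(terms))

# With T_f = 60 and C_f = 0.9, the equivalent rectangular window is
# T_tw = T_f / (1 - C_f) = 600.
area = fastviz_area(60.0, 0.9)
```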

In general, the forgetting period $T_{\mathrm{f}}$ is fixed, so there is only one free parameter controlling the filtering, e.g., the forgetting factor $C_{\mathrm{f}}$, which we assign according to the dynamic network: the faster the network densifies in time, the more aggressive the forgetting we use (see Appendix B for more details about the parameter values). In the following paragraphs, we analyze the dynamics of several structural properties of the networks produced with the fastviz, rectangular, and exponential sliding time-window methods having equal aggregating areas.

To highlight the differences between the three filtering methods, we apply them to two real dynamic networks from Twitter characterized by high changeability and measure the structural properties of resulting networks (Figure 3). The networks represent interactions in Twitter during two widely popular events: the 2013 Super Bowl and the announcement of Osama bin Laden’s death. Further description and properties of these datasets are provided in the next section.

Since sliding time-window methods place no bound on the number of tracked nodes, the networks they produce grow over time and their computational complexity increases accordingly, whereas it is bounded in fastviz. Since network structural properties such as average degree and clustering depend on the size of the network, we calculate these properties for subgraphs of equal size, i.e., for the $N_{\mathrm{b}}$ strongest nodes of the full network produced by each of the sliding time-window methods (Figures 3C-J). For simplicity, we refer to these subgraphs of $N_{\mathrm{b}}$ nodes as the buffered networks.
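Extracting these equal-size subgraphs can be sketched as follows; the function name and data layout (strength dict, edges keyed by node-pair frozensets) are our own:

```python
def strongest_subgraph(strength, weight, n_b):
    """Keep the n_b strongest nodes and the edges among them,
    i.e., the 'buffered network' used for the comparison."""
    top = set(sorted(strength, key=strength.get, reverse=True)[:n_b])
    kept = {e: w for e, w in weight.items() if e <= top}
    return top, kept
```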

Further, we find that the networks produced with our filtering method do not experience the drastic fluctuations of the global and local clustering coefficients and degree assortativity that are especially evident for the rectangular time-window (Figures 3E, G, H, and I). We conclude that fastviz filtering produces smoother transitions between network snapshots than the rectangular sliding time-window. This property of our method may improve the readability of visualizations of such dynamic networks.

Finally, fastviz captures persistent trends in the values of the properties by leveraging both short-term and long-term node activity. For instance, it captures the trends in degree, clustering coefficients, and assortativity that are less visible with the rectangular time-window, while they are clearly visible with the exponential time-window (Figures 3C-F, I, and J). Note that the high average degree obtained for networks produced with the exponential time-window corresponds to nodes that are active over a prolonged time-span, whose activity is aggregated over an unbounded period; the number of nodes is unbounded as well. In contrast, the rectangular sliding time-window shows the degree aggregated over a finite time-window, while fastviz limits the number of tracked nodes, leading to a lower reported average degree.

To measure the similarity of the sets of nodes filtered with different methods, we calculate the Jaccard similarity coefficient. Specifically, we measure the Jaccard coefficient $J$ of the sets of $N_{\mathrm{b}}$ strongest nodes filtered with fastviz and each of the time-window methods (Figures 4A and B). The value of the coefficient varies in time and among datasets. However, the similarity between fastviz and the exponential time-window is significantly higher than between fastviz and the rectangular time-window. For the Super Bowl dataset, the similarity between fastviz and the exponential time-window is close to 1 most of the time, with a drop in the middle. The drop corresponds to the period of the game, characterized by intense turnover of nodes and edges in the buffered network. The similarity is not exactly 1 for the two methods because in fastviz the weakest nodes are often forgotten and replaced by new incoming nodes, while in the exponential time-window method they are never forgotten and can slowly grow stronger over time. In the next subsection we show that this similarity is close to 1 at all times for the subsets of strongest nodes selected for visualization.
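The Jaccard coefficient used here is simply the overlap of two node sets; for instance, two 50-node selections sharing 40 nodes give $J = 40/60$:

```python
def jaccard(a, b):
    """Jaccard similarity J = |A & B| / |A | B| of two node sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a or b) else 1.0

j = jaccard(range(50), range(10, 60))  # 40 shared out of 60 total
```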

### 3.4 Network updates for visualization

In the second stage, for the purpose of visualization, the algorithm selects the $N_{\mathrm{v}} < N_{\mathrm{b}}$ nodes with the highest strengths and creates a differential update to the visualized network, consisting of these nodes and the connections between them. Each such differential update is meant to be visualized in the resulting animation of the network, e.g., as a frame of a movie.
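A differential update can be sketched as the set difference between consecutive visualized snapshots; the dictionary keys and function name below are ours, not the paper's:

```python
def differential_update(prev_nodes, curr_nodes, prev_edges, curr_edges):
    """What one animation frame must add and remove to move the drawn
    network from the previous snapshot to the current one (sketch)."""
    return {
        "add_nodes": curr_nodes - prev_nodes,
        "del_nodes": prev_nodes - curr_nodes,
        "add_edges": curr_edges - prev_edges,
        "del_edges": prev_edges - curr_edges,
    }

upd = differential_update({"a", "b"}, {"b", "c"},
                          {frozenset(("a", "b"))}, {frozenset(("b", "c"))})
```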

We compare the visualized networks generated by each of the filtering methods. Each visualized network consists of the $N_{\mathrm{v}} = 50$ strongest nodes and all connections existing between them in the buffered network. The similarity of the nodes visualized by the fastviz and exponential time-window methods, measured as the Jaccard coefficient $J$, is 1 or close to 1 (Figures 4C and D). The visualized networks of the two methods are almost identical. The structural properties of the networks created with the two methods yield almost the same values at each point in time (Figures 5A-J). This result is to be expected, since the forgetting mechanism of fastviz corresponds closely to the exponential decay of connection weights. The advantage of our method over the exponential time-window is its limited computational complexity, which makes fastviz filtering feasible even for the largest datasets of pairwise interactions. Naturally, the similarity between the visualized networks created with the two methods decreases with the size of the visualized network $N_{\mathrm{v}}$ (Figures 4E and F). More specifically, the similarity decreases with the ratio $N_{\mathrm{v}}/N_{\mathrm{b}}$, as we keep a constant value of $N_{\mathrm{b}} = 2{,}000$ in our experiments. Hence, to visualize larger networks one can choose to buffer more nodes.

The comparison of the evolution of structural properties of the corresponding buffered and visualized networks shows that these networks differ significantly for each of the filtering methods (compare Figure 3 vs. Figure 5). This difference is most salient in the case of the rectangular time-window, which yields considerably larger fluctuations of structural properties than the other methods. In the cases of fastviz and the exponential time-window, some structural properties show qualitatively similar evolution for the buffered and visualized networks, e.g., the average degree and the global clustering coefficient (Figures 3C-F vs. Figures 5C-F). We conclude that the structure of the visualized network differs significantly from the structure of the buffered network, although this difference is smaller for fastviz than for the rectangular sliding time-window.

### 3.5 Computational complexity

The computational complexity of the buffering stage of the algorithm is $\mathcal{O}(E N_{\mathrm{b}})$, where $E$ is the total number of pairwise interactions read (cliques are expanded into multiple pairwise interactions). Each time an interaction includes a node that is not yet stored in the buffered graph, the adjacency matrix of the graph needs to be updated. Specifically, the weakest node is replaced with the new node, so $N_{\mathrm{b}}$ entries of the adjacency matrix are zeroed, which over all interactions amounts to $\mathcal{O}(E N_{\mathrm{b}})$. The memory usage scales as $\mathcal{O}(N_{\mathrm{b}}^{2})$, accounting for the adjacency matrix of the buffered graph. (For certain real dynamic networks, the buffered graph is sparse. In such cases, one can propose more optimized implementations of fastviz. Here, we focus on limiting the time complexity so that it scales linearly with the number of interactions, and describe the generic implementation that achieves it.) The second, update-generating, stage has computational complexity $\mathcal{O}(U N_{\mathrm{b}} \log(N_{\mathrm{b}}))$, where $U$ is the total number of differential updates, which is a fraction of $E$ and commonly many times smaller than $E$. (Typically, a large number of interactions is aggregated to create one differential update to the visualized network. In the examples that we show in the next section, one update aggregates from 400 to 2 million interactions; therefore, $U$ is from 400 to 2 million times smaller than $E$.) This term corresponds to the fact that the strengths of all buffered nodes are sorted each time an update to the visualized network is prepared. The memory footprint of this stage is very low and scales as $\mathcal{O}(N_{\mathrm{v}})$. We conclude that our method has computational complexity that scales linearly with the number of interactions. It is therefore fast, i.e., able to deal with extremely large dynamic networks efficiently.
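The per-eviction cost comes from zeroing the evicted node's row and column of the adjacency matrix; a sketch with a plain list-of-lists matrix (the slot-based layout is our assumption):

```python
def evict_slot(adj, slot):
    """Free adjacency-matrix slot `slot` for a new node by zeroing its
    row and column: O(N_b) work per eviction, O(E * N_b) overall."""
    for k in range(len(adj)):
        adj[slot][k] = 0.0
        adj[k][slot] = 0.0

# Evicting slot 1 of a fully connected 3-node buffer.
adj = [[1.0] * 3 for _ in range(3)]
evict_slot(adj, 1)
```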