A bi-directional approach to comparing the modular structure of networks

Here we propose a new method to compare the modular structure of a pair of node-aligned networks. The majority of current methods, such as normalized mutual information, compare two node partitions derived from a community detection algorithm yet ignore the respective underlying network topologies. Addressing this gap, our method deploys a community detection quality function to assess the fit of each node partition with respect to the other network's connectivity structure. Specifically, for two networks A and B, we project the node partition of B onto the connectivity structure of A. By evaluating the fit of B's partition relative to A's own partition on network A (using a standard quality function), we quantify how well network A describes the modular structure of B. Repeating this in the other direction, we obtain a two-dimensional distance measure, the bi-directional (BiDir) distance. The advantages of our methodology are three-fold. First, it is adaptable to a wide class of community detection algorithms that seek to optimize an objective function. Second, it takes into account the network structure, specifically the strength of the connections within and between communities, and can thus capture differences between networks with similar partitions but where one of them might have a more defined or robust community structure. Third, it can also identify cases in which dissimilar optimal partitions hide the fact that the underlying community structure of both networks is relatively similar. We illustrate our method for a variety of community detection algorithms, including multi-resolution approaches, and a range of both simulated and real world networks.


Introduction
The last thirty years have been extremely fruitful in the study of networks, which enable us to represent and analyse the complex interconnection structure of a wide range of real world and engineered systems [42,43,62]. Such is the versatility and usefulness of such a representation, network analysis is increasingly popular in fields as diverse as physics, biology and sociology (see e.g., [1,14,27,28,61]). While the networks toolbox continues to expand, particularly with respect to dynamicsbased approaches [3,53] and statistical methods [6,56], there remains a range of methodological challenges related to common applications. Seeking to fill an apparent gap, here we develop a new method to address the simple and widely applicable question of the comparison of the community structure of two networks with an identical set of nodes.
A node partition into communities corresponds to a division of nodes into nonoverlapping sets such that each set of nodes is tightly interconnected with few connections to the rest of the network [18]. The presence of such communities is typically linked to the presence of some kind of higher-order organisation in a network, and is often related to functional or societal structure [54]. In many cases, a network may be reduced to a community-based representation which retains the fundamental structural and dynamical features of the original network [16,35]. There exist a wide variety of methods to detect community structure, as reviewed by Schaub et al [59].
Here we focus on developing a new method to compare the modular structure of networks. This is key to, for example, understanding differences in brain function in two neural networks, or understanding how partisanship has changed over time in the United States Congress. Since the presence of communities in these networks lies at the core of the question of interest, it is natural to use them as the basis of comparison. Perhaps the most widely-used community comparison technique is normalized mutual information (NMI) [8]. This and other similar techniques are derived from an information theoretic approach to comparing sets. By focusing exclusively on partitions, and ignoring all other features of the network (for example, the strength of the connections between different communities), these methods lose information. Furthermore, while mutual information is symmetric, there are frequent cases where one network provides useful information about the other but not vice-versa. For example, if the communities of one network are nested inside the communities of the other (meaning that one community of the second network contains several communities of the first) then the second network provides information about the structure of the first but not the other way around.
We propose a methodology that addresses both of these issues, and that is agnostic to the use of community detection algorithm within a large class of commonly used approaches. We call it the bi-directional distance, and it is based on a simple idea: we swap the partitions representing the communities of the two networks and evaluate whether the re-assigned communities are a good fit. In the sections below we demonstrate this method for a range of toy and synthetic networks, as well as a real-world example concerning inter-industry labour flows relevant for industrial policy. Labour flow networks illuminate the labour mobility structure of an economy, which is an important factor in the transport of industrial know-how and the localised emergence of new economic activities at the heart of regional economic growth policies.
The paper is structured as follows. We begin by giving a short overview of both the field of community detection and current network comparison methodologies, while also illustrating the shortcomings of current community comparison techniques. We then present our new methodology and the results of experiments on toy and synthetic networks. Second, we demonstrate the flexibility and wide applicability of our method using three distinct community detection algorithms. Third, we apply our methodology to compare the inter-industry labour flow networks of 5 European countries, Germany, Sweden, the Netherlands, the UK and Ireland. Finally, we extend our methodology to the more complex case of multi-scale community detection.

Literature Review
Community detection Given the diversity of community detection algorithms, it is useful to classify approaches according to the nature of the problem at hand. Schuab et al [59] propose four broad categories of community detection objective: community detection as the minimization of some constraint function, community detection as clustering into densely connected groups, community detection aimed at identifying structurally similar nodes and community detection as the simplified description of dynamical flows on the network. We follow this categorization to briefly introduce some of the most popular community detection algorithms.
In the first category we find cut based approaches which aim to minimize the number or weight of the edges between communities. In other words, the constraint to minimize is calculated as the number of edges that must be deleted (or cut) to achieve the partition, e.g., "ratio-cut" [20]. Graph partitioning is heavily used in parallel computing, in the layout of circuits, and to design serial algorithms to solve partial differential equations [16].
In the second category we find, among others, modularity optimization [45] which is one of the most popular community detection algorithms. A key difference with cut based approaches is that modularity does not a priori set the number of clusters, nor constrains them to be of similar size. Although not without its limitations, this approach has seen widespread use and given rise to a large number of variations and adaptations [16]. It has been used to investigate protein interaction networks, functional brain networks, and ecological networks among other applications.
The structural category groups nodes that exhibit similar connectivity patterns, that is, they are similar if they connect to similar nodes with equal probability . One popular technique for identifying groups in this manner is the stochastic block model (SBM) [24]. This approach is based on the assumption that nodes can be divided into certain classes, and that the likelihood of a connection between two nodes is determined by their respective classes. This gives rise to a probabilistic model for the network such that, via identification of the best fitting parameters, latent groups in the network can be recovered. Originating in the social networks literature, this approach is frequently used to uncover sub-groups in a social network.
The last category deploys dynamics on networks to uncover modular structure [31,42]. In this category we find Markov stability [9] and InfoMap [57]. Markov stability, for example, exploits random walker dynamics on a network to detect the presence of node communities. If a walker, which jumps from node to node with probability proportional to edge weight, gets 'trapped' in a region for a period of time this indicates a cluster of nodes with high internal connectivity. This approach has been applied to dynamical processes on networks such as consensus and synchronisation (e.g., [2,50]).
In general, as we have seen, different types of application call for different approaches to detecting community structure. Furthermore, no particular methodology consistently outperforms others [55]. Hence, any network comparison technique that focuses on the comparison of modular structure will need to accommodate a variety of underlying community detection techniques.

Network comparison methods
There is a rich literature in network comparison [13] spanning from global methods (e.g., spectral distances) to more localised approaches (e.g., graph editing). Since we are interested in comparing labelled networks, where each node represents a particular instance (a person, a place, etc.), we focus on methods where there is a one to one correspondence between the nodes of both networks.
Perhaps the simplest approach is to count the proportion of edges that need to be added and deleted to transform one graph into the other, a measure which originated in the field of information theory and is known as the Hamming distance [21]. While easy to interpret, the Hamming distance is highly sensitive to the density of the graphs, a shortfall that is addressed by the Jaccard distance [25]. The simplicity of these methods has seen them become very popular, not just in the network comparison literature, but also in the fields of information theory and machine learning [13]. Nevertheless, they are very local in scope, looking only at the direct neighbours of each node, with all deletions and insertions given equal weight regardless of changes in properties such as node importance as captured by various centrality measures. Therefore, these methods are not well suited to capture similarities at medium or large scales.
Motif based methods, which compare the frequency of certain sub-graphs (motifs) such as triangles or stars, have proved very popular in applications to protein interaction networks [33]. Despite being relatively local in scope, they are able to capture patterns beyond immediate neighbourhoods. This technique is particularly useful when the motifs represent functional components of the network. However, just as for graph editing distances, these methods fail to capture similarities in large scale connectivity or organisation patterns across networks.
On the other end of the spectrum we have spectral distances. These are global measures based on the eigenvalues of either the adjacency matrix or some version of its Laplacian. Since the eigenvalues are related to topological properties of the network, such as connectivity, they capture global features of the network. Nevertheless, these approaches are permutation invariant, and thus ignore the labels of the nodes which renders them impractical for many applications (consider a network of friendships, the spectral distance does not differentiate between the original network and one obtained by randomly exchanging individuals).
There are other approaches, both global and local, beyond those that we have outlined above but there is comparatively little work comparing networks at a mesoscopic or modular scale. By far the most widely-used community-based comparison technique is Normalized Mutual Information (NMI) [8]. NMI, which has been borrowed from the field of information theory, defines the similarity of two partitions as the mutual information of the two partitions, normalized by a combination of the Figure 1 Here we show five networks whose nodes are coloured according to their communities found using modularity. We illustrate the different behaviours of three network comparison methodologies: the Jaccard distance (graph editing), the Spectral distance (eigenvalues) and the NMI distance (partitions). Sub-figure F shows the ranking of these similarity metrics for network pairs from closest (top) to furthest (bottom). Notice that standard community comparison metrics like NMI do not differentiate between pair A-B and pair A-C since they ignore the underlying structure and consider exclusively the partitions. entropy of each partition. It therefore measures how easily one can infer one partition, given the other. Modifications of NMI have been proposed to address some of its limitations, namely that it can do very poorly when the number of communities in two networks differ [46,48]. Figure 1 illustrates some key network comparison methodologies. We have chosen three popular methodologies: the Jaccard distance, the Spectral distance, and NMI. While the Jaccard distance considers B to be the most similar to A, the Laplacian ranks C as more similar, and NMI indicates that E is the closest network to A. It is clear that these metrics capture different features of the networks. We are interested in comparing the community structure, which is shown via the colouring of the nodes in the figure. We observe that E is mostly embedded in A and therefore quite similar. This is picked up by NMI. Comparing B and C to A, it appears that C is 'closer' since the node that changes community is relatively peripheral to both the blue and yellow clusters. However, since NMI only uses information on partitionsand ignores the strength of community as well as the role of the nodes -it does not differentiate when comparing A to B or C.
Although the above methodologies for comparing networks based on community structure have proven to be useful, the fact remains that there are a number of shortcomings. Perhaps the most prominent issue is that NMI and its modifications rely exclusively on partitions of network, and ignore the quality of these partitions. By reducing the community structure to a partition, these methodologies discard important information about the network [49]. There have been some efforts to address this issue [5], but they rely on correction factors (usually weighting the nodes by their degree). Furthermore, we described earlier how different community detection algorithms differ in their objectives, but partition based methodologies do not account for these differences.
In this paper we propose an alternative framework to comparing networks: partition swapping. This approach presents three marked advantages: it is simple to understand and compute, it incorporates information beyond the partition of the network, and it is flexible, adapting to different community detection algorithms. Its simplicity comes from the underlying idea: for a pair of networks, we assess the fit of each node partition with respect to the other network's connectivity structure. Since the quality function takes into account the edge weights in order to evaluate the fit, we include information about the network topology. And by changing the quality function we immediately adapt to the application at hand.

Inter-industry labour flow networks
One of the initial motivations for the development of a new modular network comparison method was to compare the community structure of several industry networks from different countries. The edges of these industry networks are derived from worker data on inter-industry job switches. The extent to which workers jump between industry pairs corresponds to the degree of skill-overlap, a key metric for the 'relatedness' between industries which underlies network-based modelling of regional growth paths [17,38]. This type of network was first introduced by Neffke et al [36] and is known as a skill-relatedness network (SRN) in the literature. A key feature of these networks is their modular structure [47], which constrains regional diversification paths and knowledge diffusion. One might expect that these networks exhibit a near-universal structure, with skill-based industry clusters conserved across countries. On the other hand, differences in structure may illuminate potential growth paths and opportunities unseen in models based solely on data from a single country. Here we investigate these questions, seeking to deploy our new methodology to illuminate both similarities and differences across networks.
Industry networks where nodes are connected based on some kind of skill or capacity overlap have emerged as a key modelling tool in evolutionary economic geography.
The key idea is that regions are constrained by the capabilities and know-how embedded in their workforce and industrial structure [17,22,40]. In order to diversity and grow, regions tend to move into new industries that share similar or 'related' capabilities in an incremental path dependent fashion [23,36,38]. In order to model this process, a range of different types of industry networks capturing the 'relatedness' between industry pairs have been proposed. These include, amongst others, geographic clustering or co-location of industries as a proxy for general capability overlap [22], occupational similarity as a proxy for labour sharing [15], and collaboration on patents as a proxy for knowledge sharing [26]. Here we focus on 'skill-relatedness' which was proposed by Neffke et al [37]. Intuitively, if many workers move from one industry to another, then it is likely that these industries share a high degree of skill-similarity. This metric is relatively less noisy and has more even industry coverage relative to other alternatives.
Network-based models for regional growth and diversification paths broadly work as follows. We can think of the 'location' of a region in an industry network in terms of the collection (subgraph) of industries it currently has. Potential new industries share existing skills and capabilities, and correspond to neighbouring nodes in the network. Hence, if a region has many industries on the periphery of the network, it has few related industries into which it is able diversify. However, if a region has industries which are more central and connected, it has many more potential diversification paths. Skill-relatedness networks have been applied to model and predict the diversification paths of regions in a large number of contexts [36,39,52]. They have also been used in various other applications, such as models for urban formality growth [50,51], employment resilience and adaptability [12] and knowledge spillovers from MNEs [7].
Recently, O'Clery emphet al [52] showed that these networks have a distinct modular structure. Hence, they contain communities (i.e., groups of industries) within which workers switch much more easily (relative to switches between industries in different communities). The authors refer to these communities as 'skills-basins' in reference to the degree of skill and knowledge sharing within industry groupings. The presence of modular structure is a key topological feature of these networks that constrains the industrial diversification opportunities of regions, and can be used to improve network-based models for regional diversification and growth patterns.
In this study, we are interested in comparing the modular structure of various countries' SRNs. Within the related diversification literature, the SRN has been assumed to be identical across both space and time. Neffke et al. [39] showed that there was little variation between the SRN when constructed for East and West of Germany, as well as for Germany compared to Sweden [36]. The authors estimated the similarity between the two networks by calculating the Spearman correlation between edges. Based on these results, various studies have used the SRN of a different country within their analysis when the SRN of the country under consideration was not available.
We believe, however, that these networks may be different, particularly on a mesoscopic-scale (i.e., with respect to their modular structure). As an SRN is constructed from all inter-industry labour flows within a country, the network reflects intricacies of the structure of the local labour market. We expect that the degree of inter-industry labour flows is highly dependent on the historical economic progress of a country and its institutional labour market structures. We are interested in both similarities and differences in the modular structure of these networks. Similarities uncover universal structure in inter-industry skill-sharing, and portability of networks across contexts. Differences may provide insight into hidden growth opportunities: linkages and clustering patterns present in one context may suggest potential unseen paths in another.

Results
A bi-directional distance metric A large class of community detection algorithms are based on optimizing an objective function F that measures the goodness of fit of a partition according to some desired property, whether structural (for example modularity from [44]), dynamic (see [9]) or other (see [58]). We propose to compare the modular structure of two networks, say A and B, by computing the ratio of A's F -score under B's optimal partition to its F-score under its own optimal partition, and vice versa. In a sense, our two dimensional method tells us how well network A describes modular structure of B as well as the other way around.
More formally, let A, B ∈ M n×n be the adjacency matrices of two networks of size n, where there is a one to one correspondence of the nodes in each network given by the identity function. In other words, the labelling of the nodes matter and we assume that the labels of both networks are identical. Let P be the space of partitions of [n], the set of integers up to n. Without loss of generality, let F : M n×n × P → [0, 1] be an objective function. We then denote the partitions that maximize the objective function for each network. We are now ready to define our (non-symmetric) distance score: The ratio on the right tells us how well network A describes modular structure of B -a high value indicates that the optimal partition of B is indeed also a good partition of network A. By construction, d(A, B) ranges over [0, 1], and is 0 if and only if F (A, P A ) = F (A, P B ). By swapping A and B we can obtain a second distance score. In this way, for every pair of networks, we have a pair of distance scores that reflect how well their respective partitions capture each others' community structure. We propose a two dimensional distance, the partition swapping score BiDir : Hence, for any pair of networks A and B, we compute a two dimensional or bidirectional distance score in order to compare their modular structure.
This method has three distinct advantages over previous community or partition comparison methods. First, it can be adapted to any community detection algorithm that is based on optimizing an objective function. Second, it goes beyond simply comparing the resulting partition arising from a community detection algorithm but accounts for the underlying network structure, specifically the strength of the connections within and between communities. This means, for example, that our method can capture differences between networks with similar partitions but where one of them might have a more defined or robust community structure. Third, it can also identify cases in which very dissimilar partitions hide the fact that the Figure 2 Using the same networks as Figure 1 (shown in the inset of subfigure A), we compare the same pairs using the BiDir distance, shown in A. We notice that distance between the pair A-C is small, as demonstrated by a BiDir value very close to the origin and identity line. Similarly, if we project the communities of A onto network E, we obtain a very good fit and a near zero distance (as shown by the point on the x axis). This is a consequence of the fact that the communities of E are a refinement of the ones in A. This does not hold true in the other direction, i.e., if we project the communities of E onto A, as shown by a non-zero distance of the point to the y axis. On the right, in B and C, we can see how rankings derived from other metrics, such as NMI and spectral distance, do not fully capture these features. For NMI, the A-E pair is similar, but not so for the spectral distance.
community structure of both networks is relatively similar. A network might be well suited to multiple distinct partitions -for example, when there are nested communities -all of which will score highly in the objective function. Unlike other methods, our framework will pick up this signal.
In Figure 2 we illustrate the key features of the BiDir distance using the toy networks from Figure 1. We use modularity optimization as the community detection algorithm. As outlined above, we obtain not one but two values for each pair of networks, which are plotted against each other in two-dimensional space in sub-figure A.
The first thing to notice is that the two components of our distance convey distinct pieces of information. Points that are positioned away from identity line (diagonal) correspond to asymmetric behaviour, whereby one of the swapped partitions scores higher than the other. We can interpret this a case where the optimal partition of the first network is well-described by the modular structure of the second. The optimal partition of the second network, however, is not as compatible with the first. For example, we can see that the point corresponding to pair A-E is very close to the x axis. This indicates that if we project the communities of A onto network E, we obtain a very good fit. In fact, since the distance is near zero, the partition found on E is virtually as good a description of A's community structure as the communities found on A. Looking at the other direction, whereby we project the communities of E onto network A, do we not find as good a fit. By eyeballing the relevant networks, shown in the inset, we notice that the communities of E are a refinement of the communities of A, and thus we can think of them as being nested in the communities of A. This explains why the distance d(E, A) is so small.
The second key observation is that our approach implicitly takes into account the network's connectivity patterns, and not just the partition, by making use of the objective or quality function. Notice that, unlike NMI (or like any other purely partition-based methodology), we differentiate between pairs A-B and A-C.
They both are equally distant from the x axis, and so the optimal partition of A, when projected onto B or C, results in more or less the same modularity score. The community structure of C, however, scores better than B when projected onto network A. Again, looking at two networks we can understand why this is the case.
In the former case, when we project the partition of A onto network B and C, the partition misplaces a single node (relative to the optimal partition of B and C). In the latter case, when we project the partition of B and C onto network A, we see that while both the partitions derived from B and C differ from A by a single node in the first case a node that is central to the blue community in A changes groups, while in the second it is the most peripheral node that does. Hence, the distance between A and C is smaller.
This simple example shows how BiDir, by exploiting the objective function of the community detection algorithm, captures information on connectivity structure underlying the community structure of both networks, and is therefore capable of distinguishing cases which NMI and related methodologies would not be able to discriminate.

Synthetic networks
Having introduced the BiDir distance, we will now construct several families of networks drawn from a stochastic block model (SBM) to further illustrate its features and test its performance. Specifically, we probe the effect of the strength of the community structure, and nestedness and overlap of communities, on the BiDir distance.
Stochastic block models provide a good generating model for networks with a clearly defined community structure. First introduced to study the structure of social networks [24], they have been widely adopted because they provide a straightforward way of generating modular networks as well as a simple benchmark for testing the statistical significance of communities. In a classical stochastic block model, n nodes are partitioned into a set of k communities C = {C 1 , ..., C k }. A matrix P of k × k encodes inter-community connection probabilities (the probability that nodes v ∈ C i and u ∈ C j share an edge is given by P ij ). Hence, we can directly specify the number of communities as well as the likelihood of the connection within and across communities.
It is possible to use SBMs to generate networks that exhibit a nested community structure. In this case, smaller communities are nested inside large communities. This is type of a structure that we observe in many real world networks. For example consider scientific collaboration networks: at a small scale we might identify individuals that belong or belonged to the same institution or research group, but Figure 3 We illustrate our framework using three different families of SBMs. In family A all five networks contain the same communities, but the strength of in-group ties is sequentially decreasing. In subfigure D we see that the distances are zero whenever the original communities are detected, and increase with the weakening of the community structure. In family B each successive SBM is a refinement of the previous one, with blocks splitting along the way. So while B1 has two large communities, B3 has 4 communities, two for each of B1's communities. Notice that in subfigure E, the community structure of B1 describes well the structure of B2-B5, but that doesn't hold in the other direction. In the last family of SBMs C1 has three communities while C2 and C3 have four identical communities. But for C2, two of these communities can also be merged into one larger community. In subfigure F we can see that, while C2 and C3 have an identical optimal community structure, our method picks up that C1 is closer to C3 than to C2 because C3 can also be partitioned into 3 communities.
these groups might also work together within a country or region, creating two levels of modular structure.
We generate three families of SBM, as illustrated in Figure 3. Full details on the parameters used to generate these ensembles is provided in the SI. As before, modularity is used as the community detection algorithm. The distances shown correspond average BiDir distances over 1000 instances of each SBM.
The first family has four SBMs, all with the same community structure but with successively decreasing probability of in-group ties. As long as the communities can be recovered, their structure remains intact and thus the distance between them will be 0. Once the communities can no longer be properly identified by our community detection algorithm, the distances increase. We see in subfigure D that when we project the community structure of A1 onto A2, A3 and A4, we obtain values close to zero (i.e., the points lie along the x axis). As expected, A1's community structure is very close to being optimal for A2, A3 and A4 as well. However, if we instead project the community structure detected for A2, A3 and A4 onto A1, the distance progressively increases as the community detection algorithm struggles to find the underlying or generating structure. The final SBM A5, where inter and intra community connection probabilities are equal, does not have a community structure and is thus most distant from A1.
The second family is composed of five SBMs, each a refinement of the previous one. In other words, B1 has two communities, B2 has three communities, obtained by splitting one of the communities of B1 into two, B3 has four communities and so on. We compare the first SBM B1 to the other four. We observe, as we would expect, that as we split the communities further, we increase the distance from the original network. More specifically, if we project the community structure of B1 onto B2-B5, we again obtain values close to zero (i.e., the points lie along the x axis). Hence the original partition B1 remains informative even after a few splits since the new communities are nested inside the original one. Looking in the other direction, however, the more dis-aggregate partitions describe progressively less well the structure of B1.
Our last family of SBMs is slightly more complicated. C1 has three communities, but none of them maps onto the communities in C2 or C3. C2 and C3 are both split into four identical communities, but for C3 two of these communities are also strongly connected between them, and thus C3 can also be reasonably partitioned into three communities. In subfigure F we can see that the D(C2, C3) = (0, 0)since the partition of C3 into four communities has higher modularity than the partition into three, C2 and C3 the same optimal community structure. But when comparing them to C1, we see a different picture. C3 is closer to C1 because the middle community of C1 is close to the combined middle block of C3. C2 is further from C1 as the middle community of C1 overlaps two very distinct blocks in C2. We see that BiDir distance rightly identifies that although C2 and C3 have the same optimal partitions, their underlying community structure is markedly distinct.
Hence, even when the optimal partitions of two networks appear to exhibit little overlap, our method will generate a low distance score when there exists an alternative and almost optimal partition that is similar. It is well known [16] that an exhaustive optimization of modularity is impossible due to the high number of possible partitions, and furthermore, very different partitions might obtain similar near optimal scores. Therefore looking exclusively at the resulting partitions is likely to miss important structural similarities that are nevertheless captured by the BiDir distance whenever these two distinct partitions produce similar scores for the optimization function.

Application to inter-industry labour flow networks
In this section, we compare the modular structure of five European Skill-Relatedness Networks (SRN) using the BiDir distance. Recall that, although it has previously been argued that the SRN is universal, we expect to find variation across the modular structure of different countries' SRN. Understanding this variation will unveil how structural characteristics of labour markets impact on industrial diversification paths.
An SRN is constructed as follows. Two industries are considered skill-related if there is a larger number of job switches between them compared to what would be expected at random. Formally, if φ ij is the number of job switches between industry i and j within a given time period, then the skill-relatedness is given by The matrix is then made symmetric by averaging with its transpose and re-scaling the values so that they range from −1 to 1 to give the adjacency matrix: Note that values that are greater than 0 indicate that the number of job switches is greater than what would be expected at random under a null model (specifically, the configuration model [34]). We therefore conserve only positive values of this matrix. The reader is referred to Neffke et al [36] for a detailed discussion of these methodological considerations.
Here we compare the SRNs for Germany [39], Ireland [52], the Netherlands [11], the United Kingdom and Sweden [37]. In all cases, the industry classification corresponds to the 4-digit NACE 1.1 industry classification [1] . The graph intercept of these networks (with the exception of the UK, see footnote below) is used to ensure that all networks have the same size and are node-aligned -a requirement of the BiDir metric.
In Figure 4 (A) we show a visualisaton of the Irish SRN from O'Clery et al [52]. Each node in the network represents an industry and each edge the skill-relatedness between its two corresponding industries. The node layout is based on a spring [1] Due to data availability, the United Kingdom SRN was constructed using the 4-digit NACE 2 classification and then converted to the NACE 1.1 classification.
Readers are referred to the SI for a more detailed discussion regarding this conversion.
algorithm called 'Force Atlas' in Gephi, where more skill-related industries are positioned closer together. We have added labelling to indicate the general position of sectors in the network. We observe that service-orientated industries and government activities tend to be located on the left hand side of the network, whereas heavy goods, construction, manufacturing and agricultural sectors dominate the right. Retail (bottom) and business (top) activities lie in between.
The modular structure of the network is shown via the node colouring. The communities were detected using modularity [44]. Specifically, the partition maximizing the modularity function from 10000 iterations of the Louvain algorithm is adopted.
Using the same layout, we visualise the communities found for Germany, the Netherlands and Sweden on the right. This figure provides insight into how the modular structure varies across the different SRNs. First, we observe that the Irish SRN has more communities than the other 3 networks. The Irish SRN has 9 communities, the Netherlands and Germany have 7 communities, while Sweden only has 5 communities. We can also see that industries within certain sectors are always clustered together across all four networks (as indicated by solid colour blocks), such as financial intermediation, public administration and social security, agriculture and the hotel and restaurant sectors. These communities show universal structure in inter-industry skill-sharing. Furthermore, insights such as the increased subdivision of the manufacturing sector in Ireland compared to other SRNs can also be observed.
We now quantify the distance between these networks using our BiDir distance. The pairwise comparison of the various SRNs is shown in Figure 4 (F) [2] . We observe that Germany and the Netherlands are the closest in terms of modular structure with the smallest distance, and both are similar to Sweden. Ireland appears to have a very different modular structure compared to all other countries. Comparison of the Irish SRN and the German SRN uncover an asymmetric relationship. If we project the German communities onto the Irish SRN, we obtain a good fit -but not vice versa. Visual inspection of subfigures (A), (B) and (E) confirms that the communities of Ireland are somewhat nested inside the communities of [2] The United Kingdom's SRN is has fewer industries compared to the other four countries, and also much more sparse. This is because it was constructed from a longitudinal survey including just 1 per cent of UK workers. Hence, when we compare the UK to the other countries, we compare the subgraphs induced by the set of overlapping nodes (and corresponding edges) in each case. A more detailed comparison of the UK to the other countries is included in the SI.
Germany. For example, the manufacturing and social service sectors consists of many small communities in Ireland compared to just a few larger communities in Germany. Hence, inter-industry flows, and thus knowledge diffusion and skillsharing that constrains development paths, are more fragmented in Ireland. From an Irish policy perspective, the German SRN might be informative in the design of smart interventions. For example, policies could be developed to facilitate and encourage worker mobility between sectors in, for example, manufacturing that are highly interconnected in Germany, but not yet in Ireland.
Hence, our BiDir distance reveals a fairly universal modular structure in terms of inter-industry flows and skill-sharing across a set of Northern European countries. The directionality of our distance metric enables us to pick up nuanced differences between SRNs, and uncover unseen potential linkages which are key to policy efforts to generate regional industrial growth and diversification potential.

Different optimization functions
The flexibility of BiDir, in terms of its adaptability to a class of community detection algorithms (with the use of an objective function), is advantageous in that it facilitates the comparison of networks across a wide variety of applications. This is because different optimization functions define communities differently and will therefore often detect different partitions even on a single network. Thus far, we have illustrated the use of BiDir when using the modularity optimization function [41] as our objective function. Here, we illustrate BiDir distances obtained when comparing the modular structure of two networks using three different optimization functions. Specifically, we consider InfoMap [57], Markov stability [9] and modularity functions. Specifically, we illustrate the degree to which properties of an optimization function are inherited by and reflected in our distance measure.
We compare the modular structure of networks A and B illustrated in Figure 5 (A) and (B), respectively. Network A is a ring-of-rings and is an example of a network with non-clique-like communities (communities that have a high average diameter). The network is taken from Schaub et al [60]. Network B is a ring-of-circulant graphs C n 1, 2 [3] . It has a similar community structure as network A, however the clique-like structure of each community is enhanced (the average diameter of each community is decreased) by the addition of intra-community edges.
For modularity and InfoMap the partition is obtained by applying the Louvain-and internal InfoMap heuristic 1000 times respectively, and choosing the partition which either maximises the modularity or minimises the InfoMap function. Similarly, in the case of the multi-resolution stability algorithm (where time is the intrinsic resolution parameter), for each time a resulting partition is obtained by applying the Louvain heuristic 1000 times. For each time resolution, we identity the partition that maximizes the stability function. In order to choose the partition corresponding to [3] A circulant graph is a graph of order n in which the i th vertex is adjacent to the (i + j) th and (i − j) th graph vertices for each j in a so-called connection set. A circulant graph of order n and with connection set {1, 2} is denoted as Cn 1, 2 . In (A) we show the Irish SRN from O'Clery et al [52]. The nodes represent industries and edges the skill-relatedness between the two corresponding industries. Nodes are coloured according to their community (found using the modularity optimization algorithm). The node layout is based on a spring algorithm called 'Force Atlas' in Gephi. Labelling is added to indicate the general position of sectors within the network. Using the same layout we also show communities detected for Germany, the Netherlands and Sweden in sub-figure (B) to (D). In (E) we highlight the overlap of the community structure of each country with the official industrial sector classification. Industries are ordered and grouped by their 1-digit NACE 1.1 sector along the x-axis. For each country, industries are coloured according to community membership. In (F) we show the resulting BiDir distance from the pairwise comparison of the different SRNs. Notice that the German, Swedish and Dutch SRNs have relatively similar modular structure, while the Irish SRN's modular structure is less similar.
the optimal resolution, we calculate the mean variation of information [4] (VI) [32] [4] The variation in information is a metric used to measure the robustness of our partitions. The variance of information calculates the amount of information (in an information-theoretical sense) two partitions share. Two partitions that are similar will exhibit a low variation of information. For each resolution a pairwise variance of information is calculated for each partition obtained from the Louvain algorithm. The average of these is then the mean variation of information at the resolution. If this value is low, it shows that the detected partitions are similar, and therefore more robust [32]. Figure 5 We illustrate how the BiDir distance varies when using three different optimization functions, namely: Modularity, Infomap and stability. In (A) and (B) we illustrate network A and B, respectively, with node colouring corresponding to the communities detected. In (C) we show the BiDir distances obtained when comparing the modular structure of network A and B using each of the different optimization functions. A larger distance is obtained when using the modularity and Infomap function compared to the stability function. This is because the first two functions suffer from a 'field-of-view' limit and therefore over-partition network A. In general, the BiDir distance inherits the features of the optimization function deployed.
at each resolution, and choose the partition corresponding to the resolution with lowest VI. We consider the full range of partitions obtained at different resolutions in the next section.
The resulting partitions are visualised via node colouring. As noted by Schaub et al [60], we observe that both InfoMap and modularity over-partition network A (each ring is partitioned into multiple communities). This can be explained by the 'field-of-view' limit of which both of these functions suffer [19]. The 'field-of-view limit' is an upper limit on the effective diameter of the communities that can be detected with one-step dynamical community detection techniques [60]. As the rings in network A have high diameters (non-clique like structure) both these algorithms over partition the rings, creating communities with smaller diameters. However, this over-partitioning is not observed in the case of network B. This is because each of the circulants in network B have a smaller diameters and the optimization functions are able to partition them into their own community. As the Markov stability function is a multi-step dynamical community detection algorithm it does not suffer from the field-of-view limit [60], and obtains a community structure consistent with what we would expect for both network A and B.
In Figure 5 (C), we illustrate the BiDir distances (comparing network A and B) obtained when adopting each of these optimization functions. As expected, Infomap and modularity result in the largest distances. Both of these distances are also highly asymmetric. We find that if we project the communities of B onto network A, we obtain a good fit -but not vice versa. The BiDir distance therefore recognises the subdivided communities found using modularity and Infomap. As InfoMap results in the most severe over-partitioning of the network, it results in the largest and most asymmetric distance. As the Markov stability algorithm does not suffer from the 'field-of-view limit' it obtains the same community structure for network A and B, and results in a distance of (0, 0). The different distances obtained using the different optimization functions illustrates that the BiDir distance inherits the properties of the community detection algorithm used.

Multi-scale optimization functions
Although most well-known community detection algorithms seek to obtain a single node partition, in many cases it is more natural to analyse a range of partitions -from many small communities to a few large communities -when investigating the modular structure of a network. Such hierarchical structure can be informative, revealing layers of organisation. For example, consider a university friendship network. Larger communities may reveal institutional structures such as departments or colleges, while smaller communities might capture friendship or social circles.
Hence, when comparing the modular structure of two networks, one may want to consider not a single partition but a range of partitions for each network. This approach can reveal, for example, distinct distances corresponding to particular scales, and consequently uncover the resolution at which the two networks are most similar. For example, two networks may be quite similar in terms of their modular structure at a coarse scale, but distinct at a finer or more dis-aggregate scale. Taking further advantage of the flexibility of our method, here we illustrate the use of the BiDir distance for a multi-resolution optimization function.
To illustrate how we adapt the BiDir distance to this case, again consider two networks, A and B. We compute the ratio of A's F -score under B's optimal partition found at resolution β to its F -score under its own optimal partition found at resolution α, and vice versa. Formally, where P Aα is the optimal partition of network A obtained at resolution α and P B β is the optimal partition of network B obtained at resolution β. Hence, the final BiDir distance corresponds to: In practice, we calculate this distance for a range of α and β values such that small values of α or β correspond to fine partitions with many small communities, while larger values correspond to fewer larger communities [5] .
To illustrate this approach, we compare three families of networks generated by hierarchical SBMs that share modular structure at three different scales. All three SBMs consist of 300 nodes (see SI for more details). In Figure 6 A-C, we visualise the adjacency matrix for a single network generated by each SBM. Note that A and B share a coarse-level modular structure of three equally size communities. However, at a more granular-level, their community partitions are more dissimilar.
On the other hand, network A and C share a granular-level modular structure of 12 equally sized communities.
Next, we use the multi-resolution Markov stability community detection algorithm to detect a range of partitions for each of the networks. Recall, that this algorithm is based on a simple random walk model [9,29,30]. The key idea behind this method is that it sets a walker to roam on a network -jumping from node to node with probability proportional to the edge weight. If the walker gets trapped in a region of the network (a group of nodes) for a prolonged period this corresponds to a group of densely connected nodes which form a community. Here, the Markov 'time' is used as the resolution parameter. Intuitively, if we let a walker roam for longer periods on the network, the walker will detect larger and larger communities. Therefore, by varying the Markov times we detect communities on a range of scales: from many small communities to few large communities. More detail on this algorithm can be found in Lambiotte et al [30].
We adopt the Louvain algorithm [4] as our search algorithm to find the optimal node partition with respect to our objective function (the stability function [9]) at each resolution. We use variance of information (across a large number of realisations of the Louvain algorithm) to assess which of these partitions is the most robust. In sub-figure D and E, we display the number of communities and the mean variation of information for each resolution [6] . We highlight three resolutions that are both robust, with low variation of information, and correspond to three distinct scales of community structure.
First, we use the BiDir distance to compare these chosen partitions in the standard manner, as shown in (F). We clearly observe that the distance varies depending on the resolutions chosen. For example, partitions A 3 and B 3 , corresponding to larger scale coarse partitions of both A and B, are much more similar than A 1 and B 1 (fine) or A 2 and B 1 (moderate).
Next, we compare the modular structures of network A and B across the full range of resolutions. Figure 6 (G and H) shows values obtained for different (α, β) resolution pairs. As our distance is two-dimensional, (G) corresponds to direction d(A(α), B(β)) and (H) corresponds to direction d(B(β), A(α)). Consistent with our example above, we observe that A and B's modular structure is most similar (e.g., the distance is smallest) at a coarse scale with few large communities. As we increase the granularity of the community structure, the modular structure of the networks becomes more distinct (at a moderate scale) and then again more similar (at a fine scale). [6] Note that for all results, we calculate the average values obtained from generating 100, 000 network instances of each SBM and running the Markov stability algorithm with 1, 000 iterations of the Louvain heuristic on each network instance generated. In general, we observe that network A and B's modular structure is most similar (e.g., the distance is smallest) at a coarse scale with few large communities, while networks A and C are most similar at a dis-aggregate or finer scale.
We also observe an asymmetric relationship with respect to modular distance. For small α and moderate β we observe that d(A(α), B(β)) < d(B(β), A(α)). Hence, if we project the communities of network B onto network A, we obtain a good fitbut not vice versa. In this region network B has fewer and larger communities (9 communities) than network A (which has 12 communities). Thus, our metric picks up that the stability of both partitions are similar with regard to network A.
Conversely, when we compare networks A and C in (I and J), we observe that the smallest distance is obtained at a fine resolution corresponding to detection of smaller communities. As the resolution increases, and larger communities are detected, the modular structure of the networks becomes less similar. Here we ob-serve a stronger symmetric relationship between the two dimensions of the BiDir distance.

Discussion
Comparing community structure across networks remains a challenging task. The majority of current methodologies equate the community structure to the graph partition, but this approach ignores key topological information. While we also compare two partitions, by assessing the fit of one partition to the topology of the other network (and vice versa), the bi-directional distance we propose incorporates information on the network structure. This approach enables us to pick up similarities and differences in the underlying network topologies not captured in the partitions alone. Agnostic about the community detection quality function used within a broad class, this approach can be deployed across a wide range of applications.
The BiDir distance assumes that the partition assigned to each network is optimal (under some chosen community detection quality function). While this might appear like a strong assumption, the BiDir distance is well suited to handle cases where there exist alternative partitions that are close to optimal, since they all are assigned similar scores by the quality function. This feature enables our method to detect topological similarities that are not captured by (potentially dissimilar) partitions.
The multi-dimensionality of BiDir means that it is neither a metric nor a quasimetric in a formal sense. It is therefore important to be careful using it for certain applications. For example, when bench-marking different community detection algorithms using ground truth data, it will not produce a unique ordering. Furthermore, the scores given by BiDir are only comparable within a set of related networks (networks that are of the same size and exactly node-aligned). Similarly, since BiDir inherits the properties of the community detection algorithm used, the partitions to be compared should be derived from the same community detection algorithm. Further work is required to investigate if it is possible to normalize BiDir to enable more general comparisons of scores for different classes of networks.
A key aspect of community detection is that the specific algorithm is chosen according to whether structural, dynamic or other features are of interest. Since we use the quality function in our method, we retain this focus on the features of interest. However, our method is limited to algorithms that optimise an objective function. Extensions to other algorithms could be explored in future work. For example, if one was interested in comparing two networks using an SBM model, one could compare the statistical significance of two alternative SBM models (e.g., using a posterior odds ratio) instead of using an f -score.
Finally, our measure identifies the presence of similar modular structure between network pairs. However, when it comes to applications, often more information is required to understand 'where' in these networks similarities (or differences) occur. This was clearly illustrated in the comparison of the inter-industry labour flow networks, where a finer investigation of community overlap was used to shed light on sectors with either conserved or distinct patterns of labour flows across countries. The adaption of our bi-directional distance to a finer community-level distance is a potentially fruitful future research endeavour.

Funding
This project (NOC and DS) was funded by a Turing-HSBC-ONS Economic Data Science Award 'Network modelling of the UK's urban skill base'. ML received support from the Skye Foundation Trust and the Oppenheimer Memorial Trust.

Availability of Data and Materials
The methodology to reproduce the synthetic networks is outlined in the Supplementary Materials. The SRNs for Germany and Sweden are hosted by Frank Neffke [36] http://www.frankneffke.com/relatedness.html, and the SRN for the Netherlands is hosted by Dario Diodato [12] http://dariodiodato.com. The SRNs for Ireland and the UK are available upon request.

Competing interests
The authors declare that they have no competing interests.

Inter-industry labour flow networks
Here, we further elaborate on the comparison of the modular structure of the United Kingdom Skill-Relatedness Network (SRN) to the Irish, German, Dutch and Swedish SRNs using the BiDir distance.
Recall, the BiDir distance requires the networks being compared to be node-aligned (i.e. all the networks need to have the same node set). To ensure that this is the case, we first convert the UK network so that industries are defined according to the NACE 1 industry classification. We then take the node intercept of all the networks being compared. Next, we elaborate on each of these steps.
Although the UK SRN has been constructed using the methodology of Neffke et al [36], due to data availability it was constructed using the 4-digit NACE 2 industry classification. Using the official concordance table provided by Eurostat, we convert the industries from a 4-digit NACE 2 to a 4-digit NACE 1 industry classification. We adopt the network-based convergence methodology of Diodato [10].
Once all the networks are in the same industry classification, we take the node intercept of all the networks being compared. We observe that the UK's SRN has fewer industries compared to the other four countries. It is also much more sparse. This is because it was constructed from a longitudinal survey including just 1 per cent of UK workers (ASHE). On the other hand, all other networks were constructed using administrative data containing all workers within the formal sector. Hence, when comparing the UK SRN to those of other countries, only the subgraph induced by the set of overlapping nodes (and their corresponding edges) are compared in each case.
In Figure 7 (A) We show a visualisation of the UK SRN. Each node in the network represents an industry and each edge the skill-relatedness between its two corresponding industries. The same node layout and node labelling is used as in Figure 4 (A). The modular structure of the network is shown via the node colouring. The communities were detected using modularity [44]. Specifically, the partition maximizing the modularity function from 10000 iterations of the Louvain algorithm is The nodes represent industries and edges the skill-relatedness between the two corresponding industries. Nodes are coloured according to their community (found using the modularity optimization algorithm). The same node layout is used as in Figure 4. Labelling is added to indicate the general position of sectors within the network.
Using the same layout we also show communities detected for IE, DE, the NE and SE in sub-figure (B) to (E). In (F) we show the resulting BiDir distance from the pairwise comparison of the UK SRN to the other countries' SRNs.
adopted. Using the same layout, we visualise the communities found for Irish, Germany, the Netherlands and Sweden on the right. Note that for all the networks only the induced sub-graph of the SRNs are shown.
We now quantify the distance between the UK SRN and the other four countries' SRN using our BiDir distance. The pairwise comparison is shown in Figure 7 (F). We observe that the modular structure of the UK network is very different to those of the other countries. We also observe that all the distances show an asymmetric relationship. For all four countries a smaller distance is obtained when projecting the community structure onto the UK network than vice versa. As the UK network is particularly sparse, the modularity algorithm unveils a more fine-grained community structure. Our measure is therefore picking up this difference in the scale at which the communities are most robust.

Technical details regarding SBM construction
In Figure 3 we can see the performance of our framework in three different families of SBMs. We now summarize how these three different families were generated. All networks consist of 120 nodes and are generated according to the following parameters: • All networks in family A consist of 4 identical blocks of size 30, and the SBM parameters are P ij = 0.20 and P ii = 0.80, 0.60, 0.45, 0.35, 0.20 for each successive network. Since for the last network, A5, both probabilities are the same, we do not have any a priori community structure, and we can use this network as a point of comparison.
• For all networks in family B, P ij = 0.20 and P ii = 0.80. But each successive networks splits some of the blocks of the previous one. So B2 split one of the original blocks into two, B3 splits the other original block to obtain 4 identical medium blocks. B4 splits one of these 4 blocks into 3 smaller ones, and B5 repeats that procedure on the other 3 medium blocks.
• C1 and C2 are produced by again using P ij = 0.20 and P ii = 0.80, with the first one split into 3 blocks and the second one into 4 blocks. C3 has the same blocks as C2, but P 2,3 = P 3,2 = 0.40.
Furthermore, in Figure 6 we illustrate the use of the BiDir distance using a multiresolution community detection algorithm. Here, we provide more detail regarding how the three nested SBMs were constructed. All three of the networks consists of 300 nodes and have implanted communities at 3-levels. As illustrated in the adjacency matrices of each network, each network contains regions with one of 4 different shades of grey. These regions correspond to 4 different values of P ij . From darkest to lightest the regions correspond to the following four values of P ij ∈ {0.8, 0.53, 0.27, 0.15}. The size of the different regions for each network are given as follows: • Network A consists of 12 identical blocks of size 25. Two of these blocks are then merged together to form 6 identical blocks of size 50. Similarly, two of these blocks are then grouped together to form 3 identical blocks of size 100.
• Network C, consists of 12 identical blocks of size 25, similar to Network A. These blocks are merged together to form four communities containing three blocks of size 75. Finally, two 75-sized blocks are joined together to form two communities of size 150.