 Regular article
 Open Access
 Published:
Resilience analytics: coverage and robustness in multimodal transportation networks
EPJ Data Science volume 7, Article number: 14 (2018)
Abstract
A multimodal transportation system of a city can be modeled as a multiplex network with different layers corresponding to different transportation modes. These layers include, but are not limited to, bus network, metro network, and road network. Formally, a multiplex network is a multilayer graph in which the same set of nodes are connected by different types of relationships. Intralayer relationships denote the road segments connecting stations of the same transportation mode, whereas interlayer relationships represent connections between different transportation modes within the same station. Given a multimodal transportation system of a city, we are interested in assessing its quality or efficiency by estimating the coverage i.e., a portion of the city that can be covered by a random walker who navigates through it within a given time budget, or steps. We are also interested in the robustness of the whole transportation system which denotes the degree to which the system is able to withstand a random or targeted failure affecting one or more parts of it. Previous approaches proposed a mathematical framework to numerically compute the coverage in multiplex networks. However solutions are usually based on eigenvalue decomposition, known to be time consuming and hard to obtain in the case of large systems. In this work, we propose MUME, an efficient algorithm for Multimodal Urban Mobility Estimation, that takes advantage of the special structure of the supraLaplacian matrix of the transportation multiplex, to compute the coverage of the system. We conduct a comprehensive series of experiments to demonstrate the effectiveness and efficiency of MUME on both synthetic and real transportation networks of various cities such as Paris, London, New York and Chicago. A future goal is to use this experience to make projections for a fast growing city like Doha.
Introduction
In the past years scholars have increasingly realized that urban infrastructure modeling can not be addressed in a decoupled way: transportation networks in big cities are naturally multimodal, and as such commuters use different modes to move around the city. This implies that congestion in surface (car) commuting has large effects on other modes of transportation, e.g., bus or metro; the other way around, incidences in the metro system (e.g., temporary power failure in a station) will have severe consequences on the bus and private car systems. Provided that many cities—and particularly large metropolis—offer open data from all sorts of remote sensing devices, it is tempting to dive deep in those data so as to characterize such interwoven layers, and quantify their mutual effect on each other. In this paper, however, we intend to take one step back and address the question from a theoretical perspective to (i) represent the multimodal transportation system as a multiplex network; (ii) mathematically characterize the random walk coverage of this multiplex, and (iii) assess the robustness of such coverage when the system is confronted with failure. This strategy—setting a theoretical framework—provides an anticipatory understanding, for instance to avoid possible, unforeseen negative sideeffects of urban planning decisions. Also from an urban planning point of view, our proposal ultimately opens the path for an holistic route ranking, helping authorities to prioritize certain navigation strategies over others, in particular during mega events.
Regarding point (i) above, our work adds to the literature on multiplex networks, which has gained a lot of momentum in the last five years. As research on complex systems matured, it became essential to move beyond simple graphs and investigate more complicated (but more realistic) frameworks. At first sight, the expansion from “monoplex” to multiplex may be hailed as an easy one—from a network to a “stack” of networks. However, things turned out to be more complicated, and a generalization of “traditional” network theory had to be developed, e.g., see [1]. To begin with, an adjacency matrix can no longer encode the layertolayer interactions of multiplex systems, and rather supraadjacency matrices or adjacency tensors enter the scene, e.g., see [2–4]. This in turn modifies all the underlying algebra that lays at the base of monoplex network analysis, both regarding static descriptors—degree, transitivity, eigenvector centrality, modularity, etc. [5–8]—and dynamic processes [9], such as mobility on urban multiplexes. The latter—which is the focus of this contribution, see next Section—has been tackled only recently [10–12].
Needless to say, random walk dynamics—and its neighboring problems, e.g., Mean First Passage Time [13, 14] and network coverage [15, 16]—have a long tradition in network theory [17]. We here resort on De Domenico et al. [18], which offers the first theoretical generalization of random walks to the multiplex framework, as applied to navigability processes on multimodal transportation networks.
Finally, the concept of robustness has been central to network theory from the early 2000s [19, 20], because of its applied significance together with a longstanding tradition under the topic of percolation theory in Statistical Physics [21, 22]. Closer to urban questions, Arcaute et al. [23] have relied on percolation to explore the limits of regions and cities; Li et al. [24] propose an interesting dynamical percolation approach to unveil complex commuting dynamics in cities; and finally other works [25, 26] focus on the problem of infrastructural robustness and city design from the idea of progressive structure failure (removal of randomly chosen edges). More recently, Romero et al. [27] studied the impact of external stress on the structure of networks applied to social media platforms; and Baggio et al. [28] looked at the robustness of multiplex networks in a socialecological context. In the multilayer framework, percolation transitions have also been studied from a theoretical perspective, e.g., see [29].
This paper is organized as follows. In the next section,we present the data model used to represent a multimodal transporation system of a city, and formalize the problem of efficient computation of the coverage using random walkers. Then we introduce the Multimodel Urban Mobility Estimation algorithm; and in the validation section, we present the experimental evaluation of the model; with a conclusion at the end of the paper.
Data model and problems
Many physical realities can be modeled as sets of interconnected entities; and multilayer networks are used as a representation of these complex systems. We therefore observe many dynamical processes being studied on top of these networks, such as diffusion processes [30, 31], synchronization [32, 33], percolation [34, 35], etc. We use, in particular, multiplex networks to provide the comprehensive conceptual framework, see e.g., [1, 18, 30, 36–43], and random walks to study the mobility of commuters within a multimodal transportation network in a city. This will allow the development of optimal navigation strategies.
Multiplex networks
Given a set of L layers, each representing a type of relationship and containing N nodes. The relationship is represented by an edge and can be anything depending on the complex system, e.g., in social networks, it can be “friendship” on one layer such as Skype and “professional” on another layer, such as LinkedIn. For multimodal transportation systems, the nodes represent the components of the complex system, e.g., bus stations in the first layer, and metro stations in the second layer, etc. Even though the layers are different from each other, the commuters use both of them to move in a large city, and therefore it is important to represent their mobility by taking into account the coupling between layers.
A multilayer network is a pair where is a finite sequence of (directed or indirected, weighted or unweighted) intralayer graphs \(\mathcal{{G}}^{\alpha } = ( \mathcal{{V}}^{\alpha }, \mathcal{{E}}^{\alpha } )\), and is the set of interlayer connections between nodes of different layers \(\mathcal{{G}}^{\alpha }\) and \(\mathcal{{G}}^{ \beta }\), i.e.,
A multiplex network is a special type of multilayer network in which \(\mathcal{{V}}^{1} = \mathcal{{V}}^{2} = \cdots = \mathcal{{V}}^{L} = \mathcal{{V}}\), and the only possible type of interlayer connections are those in which a given node is only connected to its counterpart nodes in the rest of layers, i.e.,
Here, a nodelayer \(i(\alpha)\) means that node i participates in layer α.
In other words, multiplex networks consist of a fixed set of nodes connected by different types of links, see Fig. 1. The paradigm of multiplex networks is social systems, since these systems can be seen as a superposition of a multitude of complex social networks, where nodes represent individuals and links capture a variety of different social relations. In this study, we consider nodealigned multiplex networks, i.e., interlayer connections are “diagonal” in the sense that each node is connected only to its counterpart in the other layers, and the interlayer edges exist only between consecutive layers.
There have been some attempts in the literature for modeling multilayer networks properly by using the concept of tensors, e.g., see [6, 44]. In this study, we use proper matrix representation, and therefore the supraadjacency matrix of the multiplex network has the general form
where \({\mathbf{W}}^{(\alpha )}\) is the adjacency matrix of layer α, \({\mathbf{D}}^{(\alpha \beta )}\) is a diagonal matrix such that \(d_{ii}^{\alpha \beta }\) is the cost associated with the interlayer edge \([ i(\alpha ), i(\beta ) ]\), and \({\mathbf{D}}^{( \alpha \alpha )}\) is a diagonal matrix such that \(d_{ii}^{\alpha \alpha }\) represents the cost of staying in the same node and in the same layer.
Note that multiplex networks allow an easy integration of traversal times by adding weights to the different edges of the network. Weights of edges in the same layer will represent the time it takes to go from one station to another, whereas weights of edges connecting the same station in two different layers represent the time it takes to transfer from one mode of transportation to another. The weights of transferring can also take into account the frequency of each line, which is not part of this study. However, in some cases, frequency can be relevant in bus or rail networks.
Remark 1
The spectrum of the supraadjacency matrix (and its associated supraLaplacian matrix) is directly related to several dynamical processes that take place on a multilayer network, such as the diffusion dynamics [45], and the guarantee of a unique stationary state of the Markov process, e.g., see [46].
Represented this way, multiplex networks encode significantly more information than their single layers taken separately, since they include correlations between the role of the nodes in the different layers. For example, a node that is a hub in the metro layer is more likely to be a hub in the bus layer. Therefore, the degree of nodes in the metro layer is positively correlated with that of the bus layer. Negative correlations may also exist, when the hubs of one layer are not the hubs of another layer.
One limitation of multiplex networks, when all lines of a transportation mode are put in the same layer, is that they do not account for the cost of transferring between lines (at the same stop), especially when the stop is represented with the same node in that layer. However, there is a study by Aleta et al. that addresses this issue, see, e.g., [47].
Coverage by random walk
Random walks constitute a fundamental mechanism for many dynamics taking place on complex networks, e.g., see [48]. To assess the urban mobility in this multiplex transportation system, we model commuters as random walkers and we determine the coverage of the random walks, defined as the expected value of the number of steps to reach all nodes in the transportation system, regardless of the layer, on a walk that started from any nodelayer \(j(\alpha )\), i.e.,
i.e., it is the expected value of the number of nodes in the network being visited at least once in a time less than or equal to t, regardless of the layer, assuming that walks started from any other nodelayer in the network.
A random walk is a Markovian process [49], which means that the transitions between states are historyless, i.e., the probability of transitioning to the next state depends only on the current state, not on any of the other previous states. Moreover, at each time step, the random walker has three options: the first one is to stay at the same node, the second one is to move to other neighboring nodes on the same layer and the last one is to switch to one of its counterparts on other layers, as illustrated in Fig. 2.
The mathematical model, used in this paper, is inspired from the study in [18], and was clearly developed by us in [50].
Therefore, given a multiplex transportation system of N nodes and L layers, the discretetime master equation describing the probability of finding the walker in nodelayer \(i(\alpha )\), at time \((t+1)\), can be written as, e.g., see [18, 50, 51]
which can be assembled in matrix form as , where is the transition supramatrix (always assumed to be independent of time), and \({\mathbf{P}} \in {\mathbb{R}}^{NL}\) is a supravector containing the probability of finding the walker at any nodelayer \(i(\alpha )\), such that
For a classical random walk, the transition probability of moving from nodelayer \(i(\alpha )\) to nodelayer \(j(\alpha )\), i.e., within the same layer α, or to switch to the counterpart of vertex i in layer β, i.e., to nodelayer \(i(\beta )\), is uniformly distributed. Therefore we have
where \(w_{ij}^{\alpha }\) is the weight of the intralayer edge \([ i(\alpha ), j(\alpha ) ] \) and \(d_{(i)}^{\alpha \beta }\) is the weight of the interlayer edge \([ i(\alpha ), i(\beta ) ] \), i.e., the cost to switch from layer α to layer β at node i, while \(d_{(i)}^{\alpha \alpha }\) quantifies the cost of staying in the same node and in the same layer. These are the elements of the matrices \({\mathbf{W}}^{(\alpha )}\), \({\mathbf{D}} ^{(\alpha \beta )}\), and \({\mathbf{D}}^{(\alpha \alpha )}\) in respectively.
The intralayer strength of a nodelayer \(i(\alpha )\) is \(k_{i(\alpha )}\), and \(c_{i(\alpha )}\) is the interlayer strength of node i with respect to its connections to its counterparts in different layers. They are defined as
so that the total strength of nodelayer \(i(\alpha )\) is the sum, i.e., \(\kappa_{i(\alpha )} = k_{i(\alpha )} + c_{i(\alpha )}\).
Remark 2
Since each node is coupled only with its counterparts in different layers, then, only the elements of the type \(\mathcal{{A}}_{ii}^{\alpha \beta }\) are different from zero. Jumps to other nodes in the other layers, as in Lévy random walks, are not allowed, and therefore \(\mathcal{{A}}_{ij}^{\alpha \beta } =0\) for \(i\neq j\) and \(\alpha \neq \beta \).
Mathematical analysis of the model
In matrix form, it can be shown that the discretetime master equation (4) can be written as the initial value problem,
and without loss of generality, we assume that, at \(t=0\), the random walker is in the first layer at nodelayer \(j(1)\), i.e., \({\mathbf{P}} (t=0) = {\mathbf{P}}_{j(1)} (0)\) then the initial value problem admits the following solution
where is the usual matrix exponential, i.e.,
Remark 3
It is easy to see that \({\mathbf{P}}_{j(1)} (0) = [\mathbf{e}_{j}^{T} \ {\mathbf{0}}^{T} \ \cdots\ {\mathbf{0}}^{T}]^{T} \) with \({\mathbf{e}}_{j} \in {\mathbb{R}}^{N}\) being the canonical vector, and \({\mathbf{0}} \in {\mathbb{R}}^{N}\) is the vector of all zeros.
Theorem 1
Let be the diagonal matrix containing the total strength of all nodes, i.e., , where \({\mathbf{1}} \in {\mathbb{R}} ^{NL}\) is the vector of all ones, then . Therefore, the matrix is the normalized supraLaplacian of the multiplex network.
Proof
The supraLaplacian of the multiplex network is
Therefore, the matrix is the normalized supraLaplacian. □
The random walker can be at any layer, so let \(p_{i} (t)\) be the probability to find the walker in node i at time t, regardless of the layer, i.e.,
where \({\mathbf{E}}_{i} = [{\mathbf{e}}_{i}^{T}\ \cdots\ {\mathbf{e}}_{i}^{T} ]^{T} \in {\mathbb{R}}^{NL} \). Since , and using Equations (8) and (7), we get at time \((t+1)\) the following expression for \(p_{i} (t+1)\)
To determine the coverage, defined as in [18], let’s find an expression for the probability \(\delta_{i, j} (t)\) not to find the walker in vertex i after t time steps, assuming it started in vertex j, that is
From (10), we get the recurrence relation \(\delta_{i, j} (t+1) = \delta_{i, j} (t) [ 1  p_{i} (t+1) ] \), thus leading to the initial value problem
with \(\delta_{i,j} (0) = 0\) for \(j=i\) since the walker started in vertex j and the probability of not finding it in the same vertex is 0. In the case of \(j\neq i\), then \(\delta_{i,j} (0) = 1\). The solution to the initial value problem (11) is, see [18]
Therefore, the coverage is given by double averaging over all vertices the probability \([ 1  \delta_{i,j} (t) ] \), i.e.,
Theorem 2
The matrix need not be formed explicitly, since only its action on the vector \({\mathbf{P}}_{j(1)} (0)\) is needed, i.e., a matrixvector product, therefore
Moreover, since \({\mathbf{P}}_{j(1)}(0) = [{\mathbf{e}}_{j}^{T} \ {\mathbf{0}}^{T}\ \cdots\ {\mathbf{0}}^{T}]^{T}\), then
i.e., the jth column of and we get the following recurrences
Proof
These relations can be proven easily the usual way of proving recurrences, i.e., validate for the initial case, then assume it is correct for τ and prove that it is still correct for \(\tau + 1\). The details are skipped. □
Remark 4
The number of walks from node i to node j of length τ is the entry on row i and column j of the matrix . Therefore the matrix represents the total number of walks from node i to node j, of any length less than or equal to \((t+1)\).
Resilience to failures and percolation
Significant progress has been made in understanding the percolation properties of multilayer networks. For example, it has been shown that dependency links can have a serious impact on cascading failure events, in particular for interdependent networks. And, in many multilayer networks, some nodes of a layer are interdependent on nodes in other layers. A node is interdependent on another node in a different layer if it needs the other node to function in order to function itself properly. When two or more networks are interdependent, a fraction of node failures in one layer can trigger a cascade of failures that propagate in the multilayer network. This can mean that a network of networks as a whole may be more fragile than its constituent parts taken in isolation. A dramatic realworld example of a cascade of failures is the blackout that affected much of Italy in 2003, where the shutdown of power stations directly led to the failure of nodes in the Internet communication network, which in turn contributed to further breakdown of power stations, see [52]. Also, the work of Brummitt et al. in [53, 54] shows the importance of considering interconnected networks to better understand cascading failures. It is therefore critical to consider interdependent network properties in order to design robust networks.
It is now clear that the robustness of multilayer networks can be evaluated by calculating the size of their mutually connected giant component (MCGC) when a random failure affects a fraction of the nodes in the system, see the pioneering work in [52]. The MCGC of a multilayer network is the largest component that remains after the random failure propagates back and forth in the different layers.
The MCGC is defined as the set of nodes \(i(\alpha )\) that satisfy the following recursive set of equations, see [55]

(a)
at least one neighbor \(j(\alpha )\) of node \(i(\alpha )\) in layer α is in the MCGC;

(b)
all the interdependent nodes \(i(\beta )\) of node \(i(\alpha )\) are in the mutually connected giant component.
Network percolation theory has already been exploited in the urban context for purposes other than the ones in this work, e.g., see [24, 56, 57]. With the road networks for dozens of cities at hand, we can now proceed with the percolation dynamics in two different ways. Both of them share the idea of progressive structural deterioration [19, 20, 58], understood either as error or failure (removal of randomly chosen edges); or attack (removal of important edges, where “importance” can be quantified by some descriptor, such as high betweenness of edges, high centrality of nodes, etc.) Note that in this work we focus on bond percolation (the removal of edges) as opposed to site percolation (the removal of nodes).
To quantify the robustness of the multimodal transportation system, we use percolation theory [19] to describe the impact of edge failures in the multiplex on the coverage. We iteratively remove edges from the multiplex and compute the new coverage of the resulting network.
Computational approach
In [18], a numerical approach to estimate the coverage has been proposed. It is based on the eigendecomposition of the normalized supraLaplacian . The general form of the coverage has the following expression
where are constants depending on the vertex, the transition matrix, the eigendecomposition, and the initial conditions. Each supramatrix is obtained from products of the eigenvectors of the normalized supraLaplacian, and \(\boldsymbol{ \Lambda ^{0} }\) and \(\boldsymbol{ \Lambda ^{+} }\) indicate the sets of all null and positive eigenvalues of the normalized supraLaplacian, respectively.
Remark 5
Any solution approach based on the eigendecomposition is time consuming and hard to obtain, especially for large matrices. Therefore it should be avoided.
Proposed algorithm
The main kernel in computing the coverage is how to compute the exponent . For this, we propose the Multimodel Urban Mobility Estimation (MUME) Algorithm 1. Therefore, the way the coverage is computed here results in a tremendous saving in the computational time, as opposed to the eigendecomposition of the (normalized) supraLaplacian matrix proposed in [18].
Complexity analysis
Floating point operation (flop) is a simple, machineindependent measure of algorithm complexity. In multimodal transportation networks, we usually have a small number of layers, for example in our study, \(L=2\), since we consider a bus layer and a metro layer. Hence, in the MUME algorithm, we have one matrixvector product per iteration (Step 6) whose count is \(\ll 2 N^{2}\) flops, because of the sparsity of the matrix ; and one addition of 2Nvectors (Step 7) whose count is 2N flops.
Experimental evaluation
The main objective of this work is to study urban mobility challenges in modern cities, as well as the robustness and resilience of the complex transportation systems. Such work can serve as a basis for an automatic comparative evaluation of transportation system efficiency of different cities. The multilayer nature of the proposed framework requires data from different modes of transportation. However, it is found that not so many cities have collected, cleaned, and made data about their transportation systems publicly available. We thus limited our experimentation to four big cities: Paris, London, New York City, and Chicago.
We experiment with random graphs of different natures to derive more generalizable conclusions. In what follows, we present an overview of the data and the methods developed to produce the multiplex urban transportation network of every city from raw data. We then summarize and discuss our results for both convergence of coverage and robustness to failures.
Data
At the level of every city; we acquire, parse and combine GTFS (Google Transit Feed Specification)^{Footnote 1} datasets of every transportation mode. Google Transit Feed Specification is a format of data created to provide transit schedules and public transport information for specific geographical location. It is a “standard” developed by Google in order to help public transport agencies to publish and integrate their data with Google Maps. A typical GTFS feed includes information about multiple aspects of a transit system, such as stops, routes, trips, and schedules. Needless to say, the availability of these datasets is a key resource to study the dynamics of the transportation systems, e.g., see [59–62]. In our study, we use GTFS datasets to represent the anatomy of the public transportation system in all cities except London, and build a multiplex urban transportation network for every city.
In order to process and transform the combined GTFS datasets to multiplex system, we perform four tasks:
Merging GTFS dataset from different sources. Since the datasets come from various agencies and transportation companies which have adopted different indexes, the first step to reliably build transportation network after merging datasets is to reindex stop locations to avoid any conflicts. To do so, we join stations spatially (using latitude and longitude coordinates). We use text similarity matching techniques applied on stations’ names to double check our results.
Identifying and extracting routes. As we are interested in identifying connected locations, we start by filtering occasional trips, such as trips during national holidays, etc., from our dataset. Then, for each trip, we order stop locations based on departure time to identify connected locations.
Transportation network as a graph. We construct a graph of every transportation network in the city from the set of ordered stop locations per trip. Every set of these nodes represent a path in the network. As a result, we obtain a network for each mode of transportation in the city. As an example; for the case of Paris, the result of this step for both metro and bus networks is illustrated in Fig. 3.
Building a multiplex network. In our study, we represent the transportation network as a twolayer multiplex: Bus network and Metro network, as these two transportation modes represent the most significant urban transportation modes. The nodes of each layer represent the stop stations (bus stations or metro stations). As our multiplex system has to be ordinal and diagonal, we establish a connection (a link) between a node in one layer (e.g., bus) and its counterpart in the other layer (e.g., metro). We adopted an assumption according to which two nodes in two different layers that are within a walking distance radius (\(\leq 100~m\)) represent the same station (i.e, a station that provides a connection between the two transportation modes).
As stated in the modeling section, we assume that the transition probabilities are uniform at each node. That is, at each node, the random walker has the same probability to move through all possible edges, including those connecting to other layers.
We eliminate the nodes from layer 1 (respectively: layer 2) that don’t match any node in layer 2 (respectively: layer 1). We make sure to retain connectivity through the removed nodes by connecting their neighbors recursively. Note that by doing so, we could endup with a network that has much more edges than the initial one. So, in order to build a multiplex network, we used a simplified approach by keeping the same number of stations at every layer. We simply run a recursive algorithm to remove the nodes (stations) which do not have a counterpart in the other transportation layer, however, we retained the connectivity between the nodes. By removing the nodes, the algorithm increased the number of edges between the different remained nodes. This is for instance the case of Paris Bus network (see Table 1).
We apply the same process for each of the studied cities, and as a result, we obtain the multiplex representation of the urban transportation for every city.
In the case of the city of London, we use both (1) OpenStreetMap (OSM)^{Footnote 2} and (2) The National Public Transport Data (NPTDR).^{Footnote 3} OSM provides an updated map of different bus and metro stations in the city, whereas NPTDR contains a snapshot of every public transport journey in Great Britain for a selected week in October each year. While NPTDR database covers Great Britain (England, Scotland, Wales), we focus only on London city. First, we filter all the stations from NPTDR that are inside the bounding box of London city. Second, we extract all the stop points and trajectories of the two modes of transportation considered, i.e., bus and metro networks in this case. Then, we use these stop points and trajectories to build the graph of each layer. Next, we identify the interlayer edges that connect all the same nodes residing in both layers. Finally, we build a twolayer transportation multiplex for the city of London by merging both graphs and using the identified ordinal nodes.
Note that the hardest part about collecting data, is to use two sources of data for two modes of transportation, for example the metro network and the bus network for the city of London. When we started the resilience analysis, the data was not available for the city of London. Thus, we used other reliable data sources, Open Source Maps and The National Public Transport Data (NPTDR), in this case. The summary of the characteristic of each network is given in Table 1.
Convergence of the coverage
Given a multimodel transportation network and its corresponding multiplex representation, we are interested in how much of the network a random walker can visit (cover) within a given budget of time. The time budget can be substituted with a corresponding number of steps or movements that allow the walker to go from one node to one of its neighbors. The faster the coverage (i.e., fewer steps) the better. Indeed, the number of steps required to visit the entire network in a complete randomized setting is a very good indicator about the quality of the underlying multimodel transportation system.
We first run MUME to compute the coverage convergence curve on synthetic graphs. The idea is to build different multiplex networks of two layers to mimic the two transportation modes under study. We also enforce different configurations of the multiplex to capture the heterogeneity observed in the real networks of buses and metros as shown in Fig. 3. Thus we generate random graphs with heterogeneous degree distributions following Barabási–Albert (BA) model and other graphs with more homogeneous degree distributions following Erdős–Rényi (ER). Based on our empirical observation, we found that metro stations in general are quite somewhat similar to BA graphs, whereas bus graphs we generated resemble ER graphs. The reason of this interesting distinction resides in the way bus networks are generated in which we only keep bus stations that match metro stations, and then create edges between any two pairs of nodes (bus stations) for which there is a shortest path in the initial bus graph that doesn’t contain any metro station. This process naturally lead to a much denser graph as the number of paths in a graph is much higher than the number of its edges. Thus, we create three different multiplex networks configurations: BABA, BAER, and ERER. The first and third networks simulate cases where the two transportation modes share similar topological properties, whereas the second case simulates more realistic cases of transportation modes having different graph topologies. We fix the number of nodes in all graphs to \(N=100\) and vary the number of edges. In BA, we requested that each new node connects to two already existing nodes, while in ER we set the density score \(p=0.4\).
Practically, we vary τ, the number of steps, to take values in \([0, 1000]\) interval. We request MUME to compute the coverage score for each value of τ. We run this process several time and report on averages. Figure 4 plots the coverage curves of the different synthetic multiplex networks.
Several observations can be made here. First, while all three multiplex networks coverage converge within \(\tau =1000\) steps, it is interesting to see that ERER network reaches convergence faster. This is mainly due to the fact that the graphs on both layers in the ERER network are dense which leads to a smaller average shortest path in the whole network. This result is also partly explained by the homogeneous degree distribution of ER graphs which prevents the random walker from getting stuck in a local hub, unlike BA graphs that favor the formation of such hubs. Second, we found that BABA and BAER multiplex networks show exactly the same convergence behavior, despite the multiple runs performed. While further investigations need to be conducted to determine the actual reasons behind such behavior, we believe that the presence of one BA layer in the multiplex can heavily impact the random walker and get him into those dense regions and hubs in the BA layer that are not necessarily connected in the other layer.
Figure 5 reports the results of coverage convergence observed in the four cities studied here. We intentionally run the random walker for the same number of steps (\(\tau = 1000\)) for all cities to enable a direct comparison of the results. As expected, the convergence of the coverage happens faster in smaller graphs (Chicago) than bigger once (Paris and New York). Obviously, the smaller the number of stations a transportation system has, the fewer the number of steps required to cover all of them. We also see that the multimodal transportation network of Paris allows a higher coverage compared to New York and London. This is explained by the density of both “Metro de Paris” which is the second denser worldwide, and the great density of its corresponding bus network that has more that 38K edges (a complete graph of that size would have had approximately 45K edges). Another surprising yet interesting observation is the coverage achieved by the London multiplex network, which lays a little bit above 0.2 at \(\tau = 1000\), way behind the performances achieved by the other three cities. This is all the more surprising that both London metro network and bus network are of the same size as Paris networks. The reasons of such underperformance might be due to the fact that London networks have been generated from a different dataset which might be incomplete for the bus network (the metro network has been thoroughly verified by us).
Robustness to failures
Another important qualitative aspect of multimodal transportation systems is their ability to withstand random failures that may occur in the system. In reality, failures happen more frequently that one could imagine. A heavy traffic jam due to an accident, bad weather condition, or renovation work usually can take down an entire road segment which forces commuters to change their routes, especially in cases where bus drivers for instance cannot take initiatives on their own. It is also the case of metro stations, where frequent closure of segments happen due to electrical shutdowns, suspicious objects on the rails, maintenance work, etc. Thus, understanding the impact of such failures and the way they affect the entire system is of a great importance for cities.
To quantify the robustness of the multimodal transportation system, we use percolation theory [19] that nicely describes the impact of edge failures in the multiplex on the coverage. It is worth noticing that we are dealing with bond percolation as opposed to site percolation in which nodes are removed from the network instead.
For all multiplex networks we have created (three synthetic and four real), we iteratively remove a fraction of edges (5%) from both layers of the multiplex, and use MUME to compute the coverage achieved at \(\tau = 1000\) steps of the resulting network. As one could expect, the coverage score should be inversely correlated with the fraction of edges removed, i.e., the more failures there is, the harder it gets for the random walk to reach nodes.
Figures 6 and 7 show the degradation of the coverage as a function of the amount of removed edges in both synthetic and real multimodal transportation networks. Interestingly enough, we see in Fig. 6 that failures affect our three synthetic networks in three completely different ways. The most fragile multiplex network is BAER that gets almost completely disconnected with the removal of only 20% of its edges. This is followed by the BABA network whose coverage degradation is somewhat linear to the fraction of removed edges. ERER on the other hand demonstrates a strong robustness to failures with it securing more than 85% of its coverage when 80% of its edges are removed. While the results of ERER and BABA can be explained by the relatively high/low densities of their two basic forming graphs BA, ER (the higher the density, the better the robustness of the coverage). It is unclear why having two graphs of different topological structures severely fragilizes the whole integrated multiplex system.
Here, it is worth noting that most of the literature regards the BA graphs as being more robust to failure. However, most of these studies are “site” specific; and we may cite the works, e.g., in [63, 64], which look (even though partially) at the relationship between edge failure, robustness and network topology.
Unsurprisingly, the real transportation networks of the four cities behave just like the synthetic BAER multiplex network. High fragility is observed as networks lose more than 20% of their coverage after removing only 5% of their edges. The coverage tends to zero after the removal of 50% of edges. Despite this common fragility, one can see that Paris transportation system is slightly more robust to failures, followed by Chicago, New York City, and London.
Conclusion
The main objective of this study is to better understand and predict urban mobility patterns in the city, and analyze the robustness of the multimodal transportation system, i.e., its ability to withstand random and targeted failures. To do so, we model the multimodal transportation system as a “multiplex network” consisting of several layers that correspond to the different transportation modes available in the city; and we estimate the coverage of the city, which is defined as the average fraction of distinct vertices visited at least once during a time budget.
We first developed a mathematical framework to compute the coverage in a multiplex network setting, which we applied to different synthetic and reallife transportation systems built from four different cities, namely Chicago, London, New York, and Paris. Our experiments revealed different convergence patterns of the coverage in multiplex networks that are related to the topological characteristics of their underlying graphs. Dense and homogeneous graphs for instance lead to a faster convergence in general. Second, we looked at how different transportation networks react to failures and stress. Failures are simulated by the withdrawal of a small fraction of the edges from different layers, and coverage is computed for each removed fraction. A close inspection of the results showed that, unlike synthetic transportation networks, the four cities we studied behave quite similarly in terms of coverage degradation, with Paris network being the most robust among all. Moreover, one of the interesting findings of this work is the similarity between real transportation networks and BAER simulated networks.
As a future work, we intend to expand our mathematical framework to capture the actual commuting dynamics. Our focus will be to estimate the average travel time of commuters in different cities, and how it is affected by failures occurring in the system. We are developing a scalable computational framework to help planners in the city of Doha to efficiently manage the flow of people and intelligently handle capacity of their infrastructure. We hope that the developed computational tool will help the city of Doha to identify early problems, predict failures and design better transportation infrastructure in preparation for the FIFA 2022 world cup.
References
 1.
De Domenico M, SoléRibalta A, Cozzo E, Kivelä M, Moreno Y, Porter MA, Gómez S, Arenas A (2013) Mathematical formulation of multilayer networks. Phys Rev X 3(4):041022
 2.
Dunlavy DM, Kolda TG, Kegelmeyer WP (2011) Multilinear algebra for analyzing data with multiple linkages. In: Graph algorithms in the language of linear algebra, pp 85–114. https://doi.org/10.1137/1.9780898719918.ch7
 3.
Kolda TG, Bader BW (2009) Tensor decompositions and applications. SIAM Rev 51(3):455–500. https://doi.org/10.1137/07070111X
 4.
Sun J, Tao D, Faloutsos C (2006) Beyond streams and graphs: dynamic tensor analysis. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, pp 374–383
 5.
De Domenico M, Porter MA, Arenas A (2014) MuxViz: a tool for multilayer analysis and visualization of networks. J Complex Netw 3(2):159–176. https://doi.org/10.1093/comnet/cnu038
 6.
Kivelä M, Arenas A, Barthelemy M, Gleeson JP, Moreno Y, Porter MA (2014) Multilayer networks. J Complex Netw 2(3):203–271
 7.
SoléRibalta A, De Domenico M, Gómez S, Arenas A (2014) Centrality rankings in multiplex networks. In: Proceedings of the 2014 ACM conference on web science, pp 149–155
 8.
SoléRibalta A, De Domenico M, Gómez S, Arenas A (2016) Random walk centrality in interconnected multilayer networks. Phys D: Nonlinear Phenom 323:73–79
 9.
De Domenico M, Granell C, Porter MA, Arenas A (2016) The physics of spreading processes in multilayer networks. Nat Phys 12:901–906. https://doi.org/10.1038/nphys3865
 10.
Chodrow PS, AlAwwad Z, Jiang S, González MC (2016) Demand and congestion in multiplex transportation networks. PLoS ONE 11(9):0161738
 11.
SoléRibalta A, Gómez S, Arenas A (2016) Congestion induced by the structure of multiplex networks. Phys Rev Lett 116(10):108701
 12.
Strano E, Shai S, Dobson S, Barthelemy M (2015) Multiplex networks in metropolitan areas: generic features and local effects. J R Soc Interface 12(111):20150651
 13.
Barabasi AL, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512. https://doi.org/10.1126/science.286.5439.509
 14.
Condamin S, Bénichou O, Tejedor V, Voituriez R, Klafter J (2007) Firstpassage times in complex scaleinvariant media. Nature 450(7166):77–80
 15.
Yang SJ (2005) Exploring complex networks by walking on them. Phys Rev E 71(1):016107
 16.
da Fontoura Costa L, Travieso G (2007) Exploring complex networks through random walks. Phys Rev E 75(1):016102. https://doi.org/10.1103/PhysRevE.75.016102
 17.
Noh JD, Rieger H (2004) Random walks on complex networks. Phys Rev Lett 92(11):118701. https://doi.org/10.1103/PhysRevLett.92.118701
 18.
De Domenico M, SoléRibalta A, Gómez S, Arenas A (2014) Navigability of interconnected networks under random failures. Proc Natl Acad Sci 111(23):8351–8356
 19.
Albert R, Jeong H, Barabási AL (2000) Error and attack tolerance of complex networks. Nature 406(6794):378–382. https://doi.org/10.1038/35019019
 20.
Cohen R, Erez K, BenAvraham D, Havlin S (2000) Resilience of the Internet to random breakdowns. Phys Rev Lett 85(21):4626–4628. https://doi.org/10.1103/PhysRevLett.85.4626
 21.
Molloy M, Reed B (1995) A critical point for random graphs with a given degree sequence. Random Struct Algorithms 6(2–3):161–180
 22.
Molloy M, Reed B (1998) The size of the giant component of a random graph with a given degree sequence. Comb Probab Comput 7(03):295–305
 23.
Arcaute E, Molinero C, Hatna E, Murcio R, VargasRuiz C, Masucci AP, Batty M (2016) Cities and regions in Britain through hierarchical percolation. Open Sci 3(4):150691
 24.
Li D, Fu B, Wang Y, Lu G, Berezin Y, Stanley HE, Havlin S (2015) Percolation transition in dynamical traffic network with evolving critical bottlenecks. Proc Natl Acad Sci 112(3):669–672
 25.
Abbar S, Zanouda T, BorgeHolthoefer J (2016) Robustness and resilience of cities around the world. ArXiv preprint. arXiv:1608.01709
 26.
Wang J (2015) Resilience of selforganised and topdown planned cities? A case study on London and Beijing street networks. PLoS ONE 10(12):0141736
 27.
Romero DM, Uzzi B, Kleinberg J (2016) Social networks under stress. In: Proceedings of the 25th international conference on world wide web. international world wide web conferences steering committee, Montréal, Québec, Canada, April 11–15, 2016. ACM, New York, pp 9–20. https://doi.org/10.1145/2872427.2883063
 28.
Baggio JA, BurnSilver SB, Arenas A, Magdanz JS, Kofinas GP, De Domenico M (2016) Multiplex social ecological network analysis reveals how social changes affect community robustness more than resource depletion. Proc Natl Acad Sci 113(48):13708–13713
 29.
Bianconi G, Dorogovtsev SN (2014) Multiple percolation transitions in a configuration model of a network of networks. Phys Rev E 89(6):062814
 30.
Gomez S, DiazGuilera A, GomezGardenes J, PerezVicente CJ, Moreno Y, Arenas A (2013) Diffusion dynamics on multiplex networks. Phys Rev Lett 110(2):028701
 31.
Watts DJ (2002) A simple model of global cascades on random networks. Proc Natl Acad Sci 99(9):5766–5771
 32.
Arenas A, DíazGuilera A, Kurths J, Moreno Y, Zhou C (2008) Synchronization in complex networks. Phys Rep 469(3):93–153
 33.
GómezGardeñes J, Gómez S, Arenas A, Moreno Y (2011) Explosive synchronization transitions in scalefree networks. Phys Rev Lett 106:128701. https://doi.org/10.1103/PhysRevLett.106.128701
 34.
Achlioptas D, D’Souza RM, Spencer J (2009) Explosive percolation in random networks. Science 323(5920):1453–1455. https://doi.org/10.1126/science.1167782
 35.
Radicchi FSF (2009) Explosive synchronization transitions in scalefree networks. Phys Rev Lett 106:128701. https://doi.org/10.1103/PhysRevLett.106.128701
 36.
Boccaletti S, Bianconi G, Criado R, Del Genio CI, GómezGardeñes J, Romance M, SendiñaNadal I, Wang Z, Zanin M (2014) The structure and dynamics of multilayer networks. Phys Rep 544(1):1–122
 37.
De Domenico M, Nicosia V, Arenas A, Latora V (2015) Structural reducibility of multilayer networks. Nat Commun 6:6864. https://doi.org/10.1038/ncomms7864
 38.
De Domenico M, SoléRibalta A, Omodei E, Gómez S, Arenas A (2015) Ranking in interconnected multilayer networks reveals versatile nodes. Nat Commun 6:6868. https://doi.org/10.1038/ncomms7868
 39.
Lee KM, Min B, Goh KI (2015) Towards realworld complexity: an introduction to multiplex networks. Eur Phys J B 88(2):1–20
 40.
Menichetti G, Remondini D, Panzarasa P, Mondragón RJ, Bianconi G (2014) Weighted multiplex networks. PLoS ONE 9(6):97857
 41.
Min B, Do Yi S, Lee KM, Goh KI (2014) Network robustness of multiplex networks with interlayer degree correlations. Phys Rev E 89(4):042811
 42.
Shai S, Dobson S (2013) Coupled adaptive complex networks. Phys Rev E 87(4):042812
 43.
SoleRibalta A, De Domenico M, Kouvaris NE, DiazGuilera A, Gomez S, Arenas A (2013) Spectral properties of the Laplacian of multiplex networks. Phys Rev E 88(3):032807
 44.
De Domenico M, SoléRibalta A, Cozzo E, Kivelä M, Moreno Y, Porter MA, Gómez S, Arenas A (2013) Mathematical formulation of multilayer networks. Phys Rev X 3(4):041022
 45.
Gomez S, DiazGuilera A, GomezGardenes J, PerezVicente CJ, Moreno Y, Arenas A (2013) Diffusion dynamics on multiplex networks. Phys Rev Lett 110(2):028701
 46.
De Domenico M, Solé A, Gómez S, Arenas A (2013) Random walks on multiplex networks. ArXiv preprint. arXiv:1306.0519
 47.
Aleta A, Meloni S, Moreno Y (2017) A multilayer perspective for the analysis of urban transportation systems. Sci Rep 7:44359
 48.
Guo Q, Cozzo E, Zheng Z, Moreno Y (2016) Lévy random walks on multiplex networks. Sci Rep 6:37641. https://doi.org/10.1038/srep37641
 49.
Wilson RJ (1996) An introduction to graph theory, 4th edn. Prentice Hall, New York
 50.
Baggag A, Abbar S, Zanouda T, BorgeHolthoefer J, Srivastava J (2016) A multiplex approach to urban mobility. In: Cherifi H, Gaito S, Quattrociocchi W, Sala A (eds) The 5th international workshop on complex networks and their applications. Studies in computational intelligence, vol 693. Springer, Cham, pp 551–563. https://doi.org/10.1007/9783319509013_44
 51.
Guo Q, Cozzo E, Zheng Z, Moreno Y (2016) Levy random walks on multiplex networks. ArXiv preprint. arXiv:1605.07587
 52.
Buldyrev SV, Parshani R, Paul G, Stanley HE, Havlin S (2010) Catastrophic cascade of failures in interdependent networks. Nature 464(7291):1025–1028
 53.
Brummitt CD, D’Souza RM, Leicht EA (2012) Suppressing cascades of load in interdependent networks. Proc Natl Acad Sci 109(12):680–689
 54.
Brummitt CD, Barnett G, D’Souza RM (2015) Coupled catastrophes: sudden shifts cascade and hop among interdependent systems. J R Soc Interface 12(112):20150712
 55.
Son SW, Bizhani G, Christensen C, Grassberger P, Paczuski M (2012) Percolation theory on interdependent networks based on epidemic spreading. Europhys Lett 97(1):16006
 56.
Arcaute E, Molinero C, Hatna E, Murcio R, VargasRuiz C, Masucci P, Wang J, Batty M (2015) Hierarchical organisation of Britain through percolation theory. ArXiv preprint. arXiv:1504.08318
 57.
Wang J (2015) Resilience of selforganised and topdown planned cities. a case study on London and Beijing street networks. PLoS ONE 10(12):1–20. https://doi.org/10.1371/journal.pone.0141736
 58.
Callaway DS, Newman ME, Strogatz SH, Watts DJ (2000) Network robustness and fragility: percolation on random graphs. Phys Rev Lett 85(25):5468–5471
 59.
Delling D, Pajor T, Werneck RF (2014) Roundbased public transit routing. Transp Sci 49(3):591–604
 60.
Catala M, Dowling S, Hayward D (2011) Expanding the google transit feed specification to support operations and planning. Technical report
 61.
Puchalsky CM, Joshi D, Scherr W (2012) Development of a regional forecasting model based on Google transit feed. In: 91st annual meeting of the transportation research board, Washington, DC, USA, pp 1–17
 62.
Wong J (2013) Leveraging the general transit feed specification for efficient transit analysis. Transp Res Rec 2338:11–19
 63.
Lin Y, Kang R, Wang Z, Zhao Z, Li D, Havlin S (2017) Robustness of networks with dependency topology. Europhys Lett 118(3):36002
 64.
Abbas W, Egerstedt M (2012) Robust graph topologies for networked systems. IFAC Proc Vol 45(26):85–90. https://doi.org/10.3182/201209142US4030.00052
Acknowledgements
The authors would like to thank the anonymous reviewers for their valuable comments and suggestions to improve the quality of the paper.
Availability of data and materials
Detailed bus and metro network data are made available through open data portals. For metro networks, we parse GTFS (Google Transit Feed Specification) datasets for every city. For NYC, we used GTFS data from NYC MTA, see http://web.mta.info/developers/download.html. For Chicago, we used GTFS data from Chicago Transit Authority, see http://www.transitchicago.com/developers/gtfs.aspx. For Paris, we used GTFS data from RATP, see https://data.ratp.fr/explore/?sort=modified. For London, we used GTFS data from TFL Open Portal, see https://api.tfl.gov.uk/. These data sets generally are well maintained; however, many properties are often incomplete or missing entirely. For this purpose, we infer required road characteristics to build realistic and routable road networks using OpenStreetMap, an opensource crowd sourced mapping tool. The source code can be made available to other researchers for reproducibility and to build on top of our work.
Author information
Affiliations
Contributions
The authors contributed equally to this research article. All authors read and approved the final manuscript.
Corresponding author
Correspondence to Abdelkader Baggag.
Ethics declarations
Ethics approval and consent to participate
The manuscript reports research which does not require any approval by ethics committee(s).
Competing interests
The authors declare that they have no competing interests.
Consent for publication
All contributing authors declare their consent for the final accepted version of the manuscript to be considered for publication in the journal EPJ Data Science.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Received
Accepted
Published
DOI
Keywords
 Multiplex networks
 Robustness
 Resilience
 Coverage
 Random walker
 Multimodal transportation
 Random and targeted failures