# Personalized routing for multitudes in smart cities

Manlio De Domenico^{1} (Email author), Antonio Lima^{2}, Marta C González^{3} and Alex Arenas^{1}

**4**:1

https://doi.org/10.1140/epjds/s13688-015-0038-0

© De Domenico et al.; licensee Springer. 2015

**Received:** 22 July 2014

**Accepted:** 14 January 2015

**Published:** 28 January 2015

## Abstract

Human mobility in a city represents a fascinating complex system that combines social interactions, daily constraints and random explorations. New collections of data that capture human mobility not only help us to understand its underlying patterns but also to design intelligent systems, bringing us the opportunity to reduce traffic and to develop other applications that make cities more adaptable to human needs. In this paper, we propose an adaptive routing strategy which accounts for individual constraints to recommend personalized routes and, at the same time, for constraints imposed by the collectivity as a whole. Using big data sets recently released during the Telecom Italia Big Data Challenge, we show that our algorithm allows us to reduce the overall traffic in a smart city thanks to synergetic effects, with the participation of individuals in the system playing a crucial role.

## 1 Introduction

The rapid development of wireless communication and mobile computing technologies calls for new research that explores the responses of urban systems to the flow of instant information. Thus, the analysis of spatial signals becomes an increasingly important research theme.

The four steps required to model trips consist of calculating trip generation, trip distribution, modal split and route assignments. The sources to inform these steps have traditionally come from travel diaries and census data [1]. However, the presence of new information and communication technologies (ICT) provides big data sources that are allowing novel research and applications related to human mobility. Recent studies have advanced the knowledge on trip generation by studying the number of different locations visited by individuals through mobile phones and quantifying their frequent return to previously visited locations. These have demonstrated that the majority of trips occur between a limited number of places, with less frequent trips to new places outside an individual radius [2, 3]. In the domain of trip distributions, new models have helped us to predict the number of commuting trips when lacking data for calibration [4].

An important topic is to explore route assignments in the context of smart multimodal systems [5, 6], where individual daily trips follow recommendations based on personal and global constraints. This is of special interest for efficient cities, where individuals could be automatically routed so as to reduce the probability of traffic congestion and, at the same time, the environmental impact. From the individual’s point of view, for instance, one might want to choose a trip which minimizes the amount of traffic along the route, or to avoid routes across areas with a high level of criminality, or to favor routes across more touristic areas, *etc*. On the other hand, the choices of certain routes at the individual level, without accounting for the *state of the system*, often lead to traffic congestion [7, 8] which, in turn, is responsible for increasing pollution while decreasing the quality of the environment, with evident impact on the community.

In this work we model the trips in an urban system as interacting particles with data-driven origin-destination pairs that can be routed in their trips. Their route choices are based on a time-varying potential energy landscape that seeks to satisfy the individual’s and the community’s requirements simultaneously. Mainstream methods for distributed routing seek to avoid congestion by global travel time reduction based on optimization methods [7, 9]. More recently, adaptive path optimization on networks (the London underground network and the global airport network) related the problem to the physics of interacting polymers [10]. Here we go one step further in that direction and use a framework based on potential energy landscapes to integrate diverse layers of constraints to favor certain routes and to study the effects of the level of adoption of the proposed recommendations. Our main focus is to explore a new framework of analysis to study routing strategies for urban mobility, while the road network constraints are left to further studies.

## 2 Data-driven routing of human mobility

We consider a geographic area of interest (e.g., a city, a district, *etc*.) and we discretize it into a grid \(\mathcal{G}\) with size \(L\times L'\). In the following, for the sake of simplicity, we will consider square grids with size *L*.

We model individuals moving within the grid as a complex system of interacting sentient particles whose goal is to move between two geographic points according to certain criteria. Each criterion is encoded by a matrix **C**, with the same dimensions as the grid, where each entry indicates the state of the corresponding cell in \(\mathcal{G}\). In the same spirit as physical models of an electromagnetic surface, we use the convention that \(C_{ij}>0\) indicates a *repelling* cell, i.e., a geographic area that should be avoided. Similarly, \(C_{ij}<0\) indicates an *attracting* cell, i.e., a geographic area that should be favored for routing. Areas where \(C_{ij}=0\) are considered as neutral.

The origin of a constraint can be of different nature. In fact, there are constraints at the individual level, i.e., the ones corresponding to requirements of the single user (e.g., avoid areas with a high level of criminality), and at the global level, i.e., the ones corresponding to the requirements of the whole community (e.g., keep the pollution level to a minimum). Moreover, there are *static (or quasi-static) constraints* corresponding to restrictions that do not change over time or change over large temporal scales, and *dynamic constraints* corresponding to rapid changes within the system itself, like the traffic flow or the weather. On the one hand we should account for individuals’ goals and requirements, while on the other hand it is crucial to satisfy constraints imposed for the well-being of the community.

We denote each constraint matrix by \(\mathbf{C}^{(\alpha)}(t)\), with \(\alpha=1,2,\ldots,M\), where *M* is the total number of constraints. In the case of static constraints, the matrix is considered constant over time. Moreover, the entries of each matrix are rescaled to the range \(0\leq C^{(\alpha)}_{ij}\leq1\), for all values of *i*, *j* and *α*, to assign a relative importance to each constraint and to settle on a common scale. Finally, the total constraint matrix is defined by the linear combination of such constraints at each time step:

*etc*.), it could be necessary to change their value to satisfy different priorities.
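As an illustration of the rescaling and linear combination described above, the following minimal Python sketch builds a total constraint matrix from two layers. The min-max rescaling, the function names `rescale` and `combine`, the toy readings and the equal weights are all assumptions made for this example; the text does not specify how the rescaling or the weights were chosen.

```python
def rescale(layer):
    """Min-max rescale a 2D list of readings to the range [0, 1] (assumed scheme)."""
    flat = [v for row in layer for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A flat layer carries no spatial information: treat it as neutral.
        return [[0.0 for _ in row] for row in layer]
    return [[(v - lo) / (hi - lo) for v in row] for row in layer]


def combine(layers, weights):
    """Weighted sum of equally sized constraint layers (hypothetical weights)."""
    rows, cols = len(layers[0]), len(layers[0][0])
    total = [[0.0] * cols for _ in range(rows)]
    for layer, w in zip(layers, weights):
        for i in range(rows):
            for j in range(cols):
                total[i][j] += w * layer[i][j]
    return total


# Toy 2x2 layers; the numbers are invented for illustration only.
pollution = rescale([[10, 20], [30, 50]])
crime = rescale([[0, 4], [2, 0]])
total = combine([pollution, crime], weights=[0.5, 0.5])
print(total)
```

Changing the entries of `weights` corresponds to giving one constraint priority over the others.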

We define another matrix, \(\mathbf{D}_{\ell}(t)\) (\(\ell =1,2,\ldots ,N(t)\)), encoding the starting and destination cells of each individual in the system, where the starting point is considered to be a repelling or neutral area and the destination point is an attractor. The number of individuals \(N(t)\) is allowed to change over time. The matrix \(\mathbf{D}_{\ell}(t)\) might change over time because, in principle, the individual might change destination during his or her travel, and for simplicity we assume that \(-1\leq D_{ij}\leq0\) for each individual. It is worth remarking that attracting cells are in general associated to destinations and should be encoded in the set of matrices \(D_{\ell}(t)\), whereas repelling cells are associated to constraints and should be encoded only in the set of matrices \(C^{\alpha}(t)\).
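A destination matrix of this kind can be sketched in a few lines of Python. The function name `destination_matrix`, the choice of a neutral (zero) origin and the default depth of \(-1\) are assumptions for this example, chosen only to respect the bound \(-1\leq D_{ij}\leq0\) stated above.

```python
def destination_matrix(size, destination, depth=-1.0):
    """Return a size x size matrix that is 0 everywhere except at the destination."""
    if not -1.0 <= depth <= 0.0:
        raise ValueError("entries must stay within [-1, 0]")
    i_d, j_d = destination
    D = [[0.0] * size for _ in range(size)]
    D[i_d][j_d] = depth  # the destination cell acts as an attractor
    return D


# Hypothetical individual heading to cell (2, 3) of a 4 x 4 grid.
D = destination_matrix(4, destination=(2, 3))
```

If the individual changes destination during the trip, the matrix is simply rebuilt with the new cell.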

*ℓ*, respectively, and let \(r = \sqrt{(i_{\ell }^{(d)}-i_{\ell})^{2} + (j_{\ell}^{(d)}-j_{\ell})^{2}}\) indicate their distance. The potential energy landscape is defined by

The choice of the value of the potential at the destination is somewhat arbitrary. As a rule of thumb, it should be smaller than the potential of the neighboring cells (whose distance is \(r=1\) or \(r=\sqrt {2}\), the latter if movements along diagonals are allowed), but not so small that it creates a potential well so deep that the rest of the landscape is almost flat.

Here *a* is a non-negative number whose inverse \(\tau=a^{-1}\) defines the time scale for the convergence of \(\gamma(t)\) to 1, and *b* is the relative importance to be assigned at time \(t=0\) to constraints and destination. A reasonable choice is to balance the two potential energy landscapes to allow the particles to be routed according to the constraints *and* the destination up to a time scale *τ*, above which the influence of the destination becomes more important. Small values of *b* might give more importance to the constraints than to the destination, leading to a routing less oriented towards the final destination during the first time steps. Therefore, we require \(\gamma (0)\geq1-\gamma(0)\), leading to \(b\geq0.5\).
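To make the role of *a* and *b* concrete, the sketch below uses an exponential relaxation as one possible weighting function. This functional form is an assumption and not necessarily the one used in the paper: it is merely one choice that starts from \(\gamma(0)=b\), approaches 1 on a time scale \(\tau=a^{-1}\), and stays at zero when \(a=b=0\). The upper bound \(b\leq1\) enforced in the code is also an assumption.

```python
import math


def gamma(t, a, b):
    """Assumed form: relaxation from gamma(0) = b towards 1 with time scale 1/a."""
    if a < 0 or not 0.0 <= b <= 1.0:
        raise ValueError("expected a >= 0 and 0 <= b <= 1")
    return 1.0 - (1.0 - b) * math.exp(-a * t)


# Balanced start (b = 0.5): constraints and destination weigh the same at t = 0,
# and the destination progressively dominates for t much larger than 1/a.
for t in (0, 10, 50):
    print(t, gamma(t, a=0.1, b=0.5))
```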

*p* of individuals. To account for such a fraction *p* of individuals, we consider a set of \(N(1-p)\) individuals moving along shortest paths between pairs of origin and destination, sampled from real data as discussed further in the text, and a set of *Np* individuals moving randomly in the city, i.e., following random walks instead of shortest paths. We indicate by \(F_{\mathrm{in}}(r,t)\) the potential corresponding to the flow of individuals *within* the system, i.e., those following suggestions from the smart system, and by \(F_{\mathrm{out}}(r,t)\) the potential corresponding to the flow of individuals *out* of the system. The latter is modeled by a noisy flow in terms of random walking individuals, although other mobility models can be used. In order to preserve conservation of the flow, we rescale each term by the number of particles in the most visited cell, i.e., by a weight \(m(t)=\max[ \mathbf{F}(t)]\), where \(\mathbf{F}(t)\) is the matrix accounting for the flow of individuals in the city at time *t*, with \(\sum_{\mathrm{cell}\in\mathcal{G}}\mathbf{F}(t)=N(t)\). The matrix \(\mathbf {F}(t)\) is not weighted by the factor \([1-\gamma(t)]\) as in the case of \(C_{\sigma}(r)\) and \(C_{\delta}(r,t)\), because this would wash out the contribution of \(\mathbf{F}(t)\) to the potential landscape for increasing time. This choice makes our model more realistic: in fact, while it is possible to decide to traverse an undesirable area to balance the time spent looking for alternatives, it is not possible to traverse those areas which are congested or overcrowded. Therefore, the potential energy landscape accounting for the traffic flow should not be weighted by the function \(1-\gamma(t)\), whose existence is justified only to introduce a trade-off between the need to reach the destination and the time spent to achieve this goal while accounting for personalized constraints. Finally, Eq. (3) maps to

This model is rather general, accounting for the presence of traffic and, simultaneously, for personalized and collective, static and dynamic, constraints. However, in this study we focused only on static constraints and we aggregated time-varying constraints for simplicity. It is worth remarking here that the potential landscape \(V_{\ell }(r,t)\) experienced by individual *ℓ* still changes over time, because of the traffic flow term. Moreover, if agents are distributed in the grid according to the underlying population distribution and they move along shortest paths that adapt over time to the evolving potential landscape, it is not possible to make quantitative predictions about the state of the full system at a given time without numerical simulations.
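The numerical simulations mentioned here can be pictured with a small sketch. The code below is not the authors' potential-based scheme: it swaps in Dijkstra's algorithm on a 4-neighbor grid, where entering a cell costs one step plus a congestion penalty obtained by dividing the flow matrix by its most visited cell. The function names, the `congestion_weight` parameter and the toy flow matrix are all hypothetical.

```python
import heapq


def normalized_flow(F):
    """Divide a flow matrix by the count in its most visited cell, m = max(F)."""
    m = max(max(row) for row in F)
    if m == 0:
        return [[0.0 for _ in row] for row in F]
    return [[v / m for v in row] for row in F]


def cheapest_path(penalty, start, goal, congestion_weight=5.0):
    """Dijkstra on a grid; entering (i, j) costs 1 + congestion_weight * penalty[i][j]."""
    rows, cols = len(penalty), len(penalty[0])
    best = {start: 0.0}
    parent = {}
    queue = [(0.0, start)]
    while queue:
        cost, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if cost > best[cell]:
            continue  # stale queue entry
        i, j = cell
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < rows and 0 <= nj < cols:
                new_cost = cost + 1.0 + congestion_weight * penalty[ni][nj]
                if new_cost < best.get((ni, nj), float("inf")):
                    best[(ni, nj)] = new_cost
                    parent[(ni, nj)] = cell
                    heapq.heappush(queue, (new_cost, (ni, nj)))
    # Walk back from the goal to the start to recover the route.
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]


# Toy flow: two crowded cells sit between origin (0, 0) and destination (0, 2).
flow = [[0, 8, 0],
        [0, 8, 0],
        [0, 0, 0]]
penalty = normalized_flow(flow)
route = cheapest_path(penalty, start=(0, 0), goal=(0, 2))
print(route)
```

In this toy case the route goes around the two crowded cells through the bottom row. In a time-dependent simulation the route would be recomputed at each step as the flow matrix changes.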

## 3 Overview of the dataset

For simplicity, we considered four static layers obtained from the provided datasets; here we explain how the layers were generated. The ‘pollution’ layer was generated from readings of 7 sensors scattered around the city, taken hourly over the course of 2 months. Because these sensors are very sparse in space, we smoothed their readings in space. The ‘events’ layer was generated by looking at the number of tweets coming from each grid cell of the city. It contains 100,000 geolocated tweets generated over a 30-day period. Lastly, the ‘crime’ layer was generated from a manually curated list of crimes sourced from newspaper articles.^{1} It contains 1276 crimes that occurred over the course of 12 months in Milan and were reported by newspapers and local media.

Finally, we used data about the total number of calls and texts generated in Milan by all users of a mobile carrier over a period of two months. We used the fraction of calls and texts between areas of the city, aggregated over the whole 2-month period, to determine the distribution of trip origins and destinations, as detailed in the next section.

## 4 Simulation of personalized routing

We performed massive simulations of personalized routing in Milan to gain insights about which factors influence the time required to complete a journey.

We started by exploring different ways to sample origin and destination cells for each individual in the city. The simplest strategy would be to choose both origin and destination with uniform probability on the grid. Of course, this strategy cannot be realistic, for several reasons. On the one hand, the population is never uniformly distributed over metropolitan areas like Milan, where there is a high concentration of individuals in the ‘core’ of the city, while the population density decreases with increasing distance from the city centre [11]. In fact, assuming a uniform distribution of origins implicitly assumes a uniformly distributed population. On the other hand, the choice of a random destination, regardless of the origin, is not representative of real urban mobility, where individuals’ journeys show a high degree of spatio-temporal regularity, with a few highly frequented locations [12–14] and high predictability of the underlying trajectories [3, 15, 16].
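A data-driven alternative to uniform sampling is to draw origin-destination pairs with probability proportional to an observed interaction weight, such as the aggregated fraction of calls and texts between two areas. The sketch below shows the idea with Python's standard `random` module; the function name `sample_od_pairs` and the toy weights are hypothetical and do not come from the dataset.

```python
import random


def sample_od_pairs(weights, n, rng):
    """Draw n (origin, destination) pairs with probability proportional to weights.

    `weights` maps (origin_cell, destination_cell) to a non-negative number,
    for example an aggregated fraction of calls between the two areas.
    """
    pairs = list(weights)
    return rng.choices(pairs, weights=[weights[p] for p in pairs], k=n)


# Invented interaction weights between a few cells of the grid.
call_fraction = {
    ((0, 0), (5, 5)): 0.6,
    ((0, 0), (2, 7)): 0.3,
    ((9, 9), (5, 5)): 0.1,
    ((9, 9), (0, 0)): 0.0,  # no recorded interaction: never sampled
}
trips = sample_od_pairs(call_fraction, n=1000, rng=random.Random(42))
```

Unlike uniform sampling, pairs of areas that never interact are never drawn, and frequently connected areas dominate the simulated trips.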

*synergy* of the system and we calculate the time required to reach the destination for each individual. The remaining fraction *p* of individuals does not follow the recommended routes. We found that the underlying synergy has a non-negligible effect on the way individuals experience mobility in the city. Our results, shown in Figure 4, indicate that the average time required to complete a journey decreases for increasing synergy, i.e., for increasing adoption of the personalized routing. This result was expected: when only a small fraction of individuals move along the routes suggested by our system, it is not possible to calculate efficient trajectories because the only information available to the system is about the traffic generated by other people, while the information about their origin and destination is unknown. Conversely, when a large number of individuals adopt the suggested routes, the potential energy landscape is less subject to noisy fluctuations and a more efficient calculation of trajectories can be performed. For comparison, we show in the same figure the distribution of journey duration in the non-physical scenario where each individual travels without constraints of any type, such as traffic, *etc*. This optimal case is a free-flow scenario where every person goes to their destination undisturbed by other people. Individuals’ routes were sampled according to the origin-destination matrix also in this case. While it is not possible to fit the distribution of the ideal journey duration, our results show that 100% synergy produces a distribution close to the ideal one. It is worth remarking that this analysis could quantify the benefits of synergy for urban traffic if information on the individual adoption of routing technology were available to researchers.

*ℓ* we calculate the mean speed at time *t* by

*t* and the time required to travel. Here \(t_{\ell}^{(0)}\) indicates the time at which the particle has been injected into the system, i.e., the time at which the individual leaves the origin of his or her route. The temperature of this system can be defined as the mean squared speed \(\langle v_{\ell}^{2} \rangle_{\ell}\). This measure is better understood in terms of the permeability (or connectivity) of the city, as defined in urban studies, which allows us to quantify how fast individuals flow through the city. Therefore, we define the permeability by

In the bottom panel of Figure 5 we show how the anomaly changes over time. The traffic experiences large fluctuations for large values of *t*, both positive and negative, alternating periods of high permeability with a few periods of low permeability. This is due to a few overcrowded cells that are quickly and automatically decongested by the system itself. Therefore, it is possible to monitor the traffic of the city by looking at the permeability and its anomaly over time, programming different alert levels such as low (\(-2\leq\mathcal {A}(t)<-1.7\)), medium (\(-2.6\leq\mathcal{A}(t)<-2\)) or critical (\(\mathcal{A}(t)<-2.6\)).
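The alert thresholds quoted above translate directly into a small classifier. The function name `alert_level` and the label `"none"` for values at or above \(-1.7\) are assumptions of this sketch; the thresholds themselves are the ones given in the text.

```python
def alert_level(anomaly):
    """Map a permeability anomaly A(t) to the alert levels quoted in the text."""
    if anomaly < -2.6:
        return "critical"
    if anomaly < -2.0:
        return "medium"   # -2.6 <= A(t) < -2
    if anomaly < -1.7:
        return "low"      # -2 <= A(t) < -1.7
    return "none"         # hypothetical label: no alert raised


# Example anomaly values, chosen to sit on and around the boundaries.
levels = [alert_level(a) for a in (-3.0, -2.6, -2.0, -1.7, 0.4)]
print(levels)
```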

## 5 Discussion and conclusions

We have presented a strategy to route individuals between pairs of points of interest according to constraints of different types. Our method accounts for the simultaneous interplay between personalized constraints, such as avoiding specific areas of the city because of personal choices, and collective constraints, from pollution reduction in certain areas of the city to the presence of adverse atmospheric conditions requiring targeted intervention. We have shown that the synergy plays a fundamental role in designing a smart city: only when all individuals take part in the routing system and move according to the recommended routes does the overall traffic in the city approach the ideal mobility scenario. In the presence of real-time information, our method makes it possible to monitor the state of the city in real time, automatically identifying areas that are experiencing temporary congestion and giving authorities the possibility to intervene in a timely manner.

Finally, the potential applications of our routing strategy are multiple. For instance, for certain values of the parameters (i.e., \(a=b=0\), leading to \(\gamma(t)=0\)), we obtain a routing strategy from an origin and without a fixed destination, while accounting for specified constraints. This case could be useful to perform automated routing of objects or individuals through the city. For instance, it would be possible to route cars or drones which are collecting data about the city (such as Google cars) and to route people in charge of social services like cleaning the streets or performing targeted interventions, such as spreading salt in areas with snow. An additional application could be in the field of public security, to route police cars to areas with high crime rates. Finally, our framework can help decision-makers to apply urban mobility policies in real time in response to crises: e.g., the emergence of hotspots of infection in specific areas of the city (or a larger area) can be incorporated into the model to prevent people from passing through dangerous areas before physical quarantine is employed.

## Declarations

### Acknowledgements

MDD is supported by the European Commission FET-Proactive project PLEXMATH (Grant No. 317614), AA by the MULTIPLEX (grant 317532) and the Generalitat de Catalunya 2009-SGR-838. AA also acknowledges financial support from the ICREA Academia, the James S. McDonnell Foundation, and FIS2012-38266. MCG acknowledges Accenture and the KACST-Center for Complex Engineering Systems.

**Open Access** This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.

## References

1. Hazelton ML (2008) Statistical inference for time varying origin-destination matrices. Transp Res, Part B, Methodol 42(6):542-552
2. Schneider CM, Belik V, Couronné T, Smoreda Z, González MC (2013) Unravelling daily human mobility motifs. J R Soc Interface 10(84):20130246
3. Song C, Qu Z, Blumm N, Barabási A-L (2010) Limits of predictability in human mobility. Science 327(5968):1018-1021
4. Simini F, González MC, Maritan A, Barabási A-L (2012) A universal model for mobility and migration patterns. Nature 484:96-100
5. De Domenico M, Solé-Ribalta A, Gómez S, Arenas A (2014) Navigability of interconnected networks under random failures. Proc Natl Acad Sci USA 111(23):8351-8356
6. Gallotti R, Barthelemy M (2014) Anatomy and efficiency of urban multimodal mobility. Sci Rep 4:6911
7. Youn H, Gastner MT, Jeong H (2008) Price of anarchy in transportation networks: efficiency and optimality control. Phys Rev Lett 101(12):128701
8. Wang P, Hunter T, Bayen AM, Schechtner K, González MC (2012) Understanding road usage patterns in urban areas. Sci Rep 2:1001
9. Delling D, Goldberg AV, Pajor T, Werneck RF (2011) Customizable route planning. In: Experimental algorithms. Springer, Berlin, pp 376-387
10. Yeung CH, Saad D, Wong KM (2013) From the physics of interacting polymers to optimizing routes on the London underground. Proc Natl Acad Sci USA 110(34):13717-13722
11. Makse HA, Havlin H, Stanley H (1995) Modelling urban growth. Nature 377:19
12. Gonzalez MC, Hidalgo CA, Barabasi A-L (2008) Understanding individual human mobility patterns. Nature 453(7196):779-782
13. Lima A, De Domenico M, Pejovic V, Musolesi M (2013) Exploiting cellular data for disease containment and information campaigns strategies in country-wide epidemics. arXiv:1306.4534
14. Salnikov V, Schien D, Youn H, Lambiotte R, Gastner M (2014) The geography and carbon footprint of mobile phone use in Côte d’Ivoire. EPJ Data Sci 3(1):3
15. Song C, Koren T, Wang P, Barabási A-L (2010) Modelling the scaling properties of human mobility. Nat Phys 6(10):818-823
16. De Domenico M, Lima A, Musolesi M (2013) Interdependence and predictability of human mobility and social interactions. Pervasive Mob Comput 9(6):798-807
17. Crandall DJ, Backstrom L, Cosley D, Suri S, Huttenlocher D, Kleinberg J (2010) Inferring social ties from geographic coincidences. Proc Natl Acad Sci USA 107(52):22436-22441
18. Farrahi K, Emonet R, Cebrian M (2014) Epidemic contact tracing via communication traces. PLoS ONE 9(5):95133
19. Palchykov V, Mitrović M, Jo H-H, Saramäki J, Pan RK (2014) Inferring human mobility using communication patterns. Sci Rep 4:6174