The impact of social segregation on human mobility in developing and industrialized regions
© Amini et al.; licensee Springer 2014
Received: 25 January 2014
Accepted: 22 May 2014
Published: 6 June 2014
This study leverages mobile phone data to analyze human mobility patterns in a developing nation, especially in comparison to those of a more industrialized nation. Developing regions, such as the Ivory Coast, are marked by a number of factors that may influence mobility, such as less infrastructural coverage and maturity, fewer economic resources and less stability, and in some cases, more cultural and language-based diversity. By comparing mobile phone data collected from the Ivory Coast to similar data collected in Portugal, we are able to highlight both qualitative and quantitative differences in mobility patterns, such as differences in the likelihood to travel and in the time required to travel, that are relevant to policy, infrastructure, and economic development. Our study illustrates how cultural and linguistic diversity in developing regions (such as Ivory Coast) can present challenges to mobility models that were conceptualized, and perform well, in less culturally diverse regions. Finally, we address these challenges by proposing novel techniques to assess the strength of borders in a regional partitioning scheme and to quantify the impact of border strength on mobility model accuracy.
Keywords: predictive human mobility, social networks, cultural diversity
Transportation and communication networks form the fabric of industrialized nations. The roll-out of such infrastructure can play a major role in supporting, or deterring, a region’s ability to thrive economically and socially. Likewise, citizens’ use of these networks can tell us much about the region, including insight into how ideas and diseases may be spreading, or how to most effectively augment services such as health care and education.
Existing studies of mobile phone data have given us insight on numerous aspects of human mobility [2–6]. However, these studies tend to focus on regions with the highest mobile phone coverage, which also tend to be the more stable, mature, and developed regions. Thus, the models produced from this data might not be as appropriate for developing regions with substantially different patterns of social interaction and human mobility. Moreover, these highly industrialized and wealthy regions represent less than one-third of the world’s population, with the remaining two-thirds living in developing and less economically mature regions. Accurate models for developing regions are critical, as these regions are facing the most rapid demographic and economic shifts worldwide and are in even greater need of such models to help inform policy makers, urban planners, and service providers. Yet, little work has been done to assess whether models conceptualized for industrialized regions are appropriate for use in developing regions.
Obtaining a comprehensive and accurate dataset of the telecommunications activity in developing regions can be extremely difficult due to security and privacy considerations, limited coverage by any single provider, and the need for a rigorous data capture methodology and infrastructure. The Data4Development (D4D) dataset  provided a unique opportunity by collecting data throughout the Ivory Coast and releasing it specifically for research purposes, so that developing regions could also be analyzed in greater detail. The contrast of long-standing cultural and linguistic diversity with relatively recent and rapid urbanization offered researchers a unique opportunity to understand the communication and mobility patterns and needs of a developing nation during key phases of its transformation.
Our study brought the D4D data from Ivory Coast together with mobility data from an industrialized nation (Portugal) in order to assess the ability of human mobility models developed for industrialized regions to accurately model developing regions. We focus on the comparison of these two countries, which are at very different stages of industrialization and have very different levels of cultural and linguistic diversity, and shed new light on the applicability of metrics and models conceptualized for industrialized regions to developing regions. Our results demonstrate the importance of considering cultural and linguistic diversity in the construction of new models to address the challenges of developing regions. The insights gained from our study have important applications to policymaking, urban planning, and the service deployments that are transforming Ivory Coast and many other developing countries.
In the following sections, we provide additional details on the data used in this study, the results derived, and the conclusions drawn.
Leveraging mobile phone data to elucidate and quantify many aspects of human life is growing in popularity. For example, mobile phone data has been used to gain insights from a diversity of cultures, ranging from university students to professionals in the US, Finland, and Africa. Targeted cultural patterns included pace of life, reaction to outlier events, and social support, as opposed to the mobility focus of our study. Eagle et al. used mobile phone communication logs and top-up records to conduct a comprehensive comparison of urban and rural life within a small country, as opposed to across countries as targeted by our research. Mobile phone data has also been used to study the seasonal consumption patterns of tourists in Estonia.
As part of the D4D competition, researchers studied a wide range of topics, including social behaviors, economics, health, transportation, and mobility. Several studies [11–13] considered mobility patterns for the purpose of improving the planning or efficiency of transportation systems. While some articles [14, 15] targeted mobility within the largest city, Abidjan, and others considered mobility across the country, none tackled the challenge of assessing mobility models created for mature and industrialized nations on the developing nation of Ivory Coast. Additionally, ours is the first study to have considered the linguistic and cultural barriers and affinities that we have shown to be significantly stronger in the developing nation of Ivory Coast, in comparison to the more industrialized nation of Portugal. Thus our study represents a novel and important contribution to understanding the challenges of creating globally applicable mobility models.
We used four datasets to assess and compare the human mobility patterns in the Ivory Coast and Portugal. The first dataset, D1, was provided by Orange Telecom as SET2, via the Data for Development (D4D) Challenge. This dataset was based on anonymized Call Detail Records (CDRs) of 2.5 billion calls and SMS exchanges between 5 million users from December 1, 2011 until April 28, 2012 (150 days).
The second dataset, D2, provided 400 million anonymized CDRs across Portugal for the time period of January 1, 2006 to December 31, 2007. D2 was also provided by Orange Telecom with 2000 antennae distributed across Portugal, and the same data fields as in D1.
Datasets D3 and D4 provided high-resolution population density data for Ivory Coast and Portugal, respectively. To map the population data to the antennae, we created a Voronoi tessellation of each country based on the antennae locations. For the 12 locations that had 2-3 antennas in a single location, those 2-3 antennas were collapsed into a single Voronoi cell. Each antenna was assigned the total population within the corresponding Voronoi cell. Figure 1A provides a logarithmic scale population density distribution map using the data from D3 and D4. The population maps were created as an interpolation of the population density at each antenna.
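The mapping of population to antennae described above can be sketched as follows. A point lies in an antenna's Voronoi cell exactly when that antenna is its nearest one, so this minimal sketch assigns each cell of a gridded population map to its nearest antenna. The planar coordinates, the grid representation, and the function names are assumptions made for illustration, not details taken from the study.

```python
import math

def nearest_antenna(point, antennas):
    """Index of the antenna closest to `point` (planar Euclidean distance)."""
    return min(range(len(antennas)), key=lambda k: math.dist(point, antennas[k]))

def population_per_antenna(antennas, grid_cells):
    """Sum the population of every grid cell into its nearest antenna.

    antennas:   list of (x, y) antenna coordinates in a projected, planar system
    grid_cells: list of ((x, y), population) pairs from a gridded population map
    """
    totals = [0.0] * len(antennas)
    for point, population in grid_cells:
        # Nearest-antenna assignment is equivalent to Voronoi cell membership.
        totals[nearest_antenna(point, antennas)] += population
    return totals

# Toy example (hypothetical values): two antennas, three population grid cells.
antennas = [(0.0, 0.0), (10.0, 0.0)]
grid_cells = [((1.0, 1.0), 120.0), ((4.0, 0.0), 30.0), ((9.0, 2.0), 50.0)]
print(population_per_antenna(antennas, grid_cells))
```

Co-located antennas would need to be merged into a single entry before calling this function, mirroring the collapsing step described in the text.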
Although widely used for human mobility studies, mobile phone data provides only a proxy for human mobility. For example, callers are tracked only to the spatial resolution of the antenna (which may be up to 70 km, depending on tower height and terrain), and usually only when the phone is in use; moreover, not everyone uses a mobile phone while traveling. However, even in developing regions, mobile phone penetration is high. Ivory Coast has 85% mobile phone coverage, with Orange Telecom (the provider of the data for Ivory Coast and Portugal used in this study) being the top mobile phone provider with a market share of 42.5%. Mobile phone penetration in Portugal is almost absolute, and the total number of phone accounts is even higher than the total number of people: the number of accounts amounts to 142% of the country's population, with Orange Telecom holding a 19% market share.
Collective mobility patterns
We also investigated regional differences in the mobility patterns. In both cases of Ivory Coast and Portugal, we identified the first level administrative boundaries as the highest country-defined level of partitioning. For Ivory Coast these are called ‘régions,’ while in Portugal they are referred to as ‘districts.’ We partitioned the mobility data by the different level-one administrative regions and overlaid the same density functions specific for each administrative region on the same plots above. Different administrative regions are identified by different scatter marker types and colors. We observed the same truncated power law behavior across the different regions, but the Ivory Coast regions exhibited significantly greater diversity than similarly defined regions in Portugal. This would indicate that in Ivory Coast the likelihood that people migrate and commute with respect to distance is much more dependent on what part of the country they are in, as opposed to in Portugal where the different administrative regions show very little diversity from each other.
Another important metric for assessing mobility patterns is the radius of gyration. As previously defined in the literature, the radius of gyration of a caller is the characteristic distance traveled by that caller when observed up to time t, and is computed as the root mean squared distance of the caller's catchment locations from their center of mass. The probability distribution of the radius of gyration across callers is plotted in Figure 2B, with t equal to the period of data collection for each of the datasets.
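As an illustration, the radius of gyration of a single caller can be sketched as the root mean squared distance of the recorded locations from their center of mass. This sketch assumes planar coordinates in kilometers and one location per recorded call; the function name and the toy data are hypothetical.

```python
import math

def radius_of_gyration(locations):
    """Root mean squared distance of visited locations from their center of mass.

    locations: list of (x, y) positions in km, one entry per recorded call
    """
    n = len(locations)
    cx = sum(x for x, _ in locations) / n
    cy = sum(y for _, y in locations) / n
    mean_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in locations) / n
    return math.sqrt(mean_sq)

# Toy example: a caller alternating between two towers 6 km apart.
calls = [(0.0, 0.0), (6.0, 0.0), (0.0, 0.0), (6.0, 0.0)]
print(radius_of_gyration(calls))  # 3.0
```

Computing this value for every caller and histogramming the results would give an empirical distribution of the kind plotted in Figure 2B.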
The distributions plotted in Figure 2B show that the bulk mobility data from Ivory Coast adheres well to the scale-free framework proposed in earlier work. The similarity in the bulk mobility characteristics between Ivory Coast and Portugal serves to strengthen the argument that we can make valid comparisons between the two datasets, as described in the sections below.
Daily commuting patterns are a critical component of any region’s mobility requirements. Displacement is defined as movement from one cell tower to another cell tower between two consecutive calls, and is a key marker for assessing mobility. To focus on daily commuting patterns, we excluded data collected during weekends, and computed the fraction of inter-call events that were accompanied by displacements in a moving 40-minute window of time for Ivory Coast and Portugal. We averaged the fraction of displacement for each 40-minute window across 45 weekdays to get a 24-hour temporal profile of the probability of displacement during a workday.
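The displacement computation described above can be sketched for a single weekday as follows. The record layout `(user_id, minute_of_day, tower_id)`, the 10-minute step of the moving window, and the choice to place each inter-call event at the time of the second call are assumptions made for illustration; averaging the resulting profiles over weekdays is omitted.

```python
from collections import defaultdict

def displacement_profile(calls, window=40, step=10):
    """Fraction of inter-call events with a tower change, per moving window.

    calls: list of (user_id, minute_of_day, tower_id) for one weekday
    Returns {window_start_minute: fraction of events that are displacements}.
    """
    by_user = defaultdict(list)
    for user, minute, tower in calls:
        by_user[user].append((minute, tower))

    # One inter-call event per pair of consecutive calls by the same user.
    events = []  # (minute of the second call, True if the tower changed)
    for records in by_user.values():
        records.sort()
        for (_, prev_tower), (minute, tower) in zip(records, records[1:]):
            events.append((minute, tower != prev_tower))

    profile = {}
    for start in range(0, 24 * 60 - window + 1, step):
        in_window = [moved for minute, moved in events if start <= minute < start + window]
        if in_window:  # windows without any inter-call event are left out
            profile[start] = sum(in_window) / len(in_window)
    return profile

# Toy example (hypothetical records): user "a" changes tower in the morning.
calls = [("a", 470, 1), ("a", 485, 2), ("b", 480, 5), ("b", 490, 5), ("a", 900, 2)]
profile = displacement_profile(calls)
print(profile[480])  # window 08:00-08:40: one displacement out of two events
```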
The first and probably the most significant difference is the absolute difference in the probability of displacement, which can be seen in Figure 2C. We observe that in Portugal, in a given period, people are much more mobile compared to their counterparts in Ivory Coast.
Both countries exhibit a commuting pattern; there is a sharp rise in the probability of displacement around 7-9 a.m. The evening decline is not as sharp, suggesting that people leave work at different times in the evening.
Significant quantitative differences between the countries can also be seen throughout the day. In Portugal, people in Lisbon and across the nation exhibited a similar likelihood to commute during the busiest hours. However, a significantly higher percentage of people in Abidjan were mobile than across the nation. Additionally, while displacement levels in Abidjan and across Ivory Coast were similar during the lowest period (4-7 a.m.), displacement for the same period is significantly higher for Portugal than for Lisbon, which is likely an indicator of more significant numbers of suburban commuters in Portugal than in Ivory Coast.
Figure 2D provides a comparison of the mean migration distances between the two countries for the same period. Here again, the average distance traveled is significantly less in Ivory Coast and its largest city, Abidjan, than in Portugal. In the country-wide data, we observe a sharp increase in the mean inter-event displacement distance near the morning peak commute (around 5-9 a.m.) in both Ivory Coast and Portugal. However, the spike in distance encountered in Lisbon during the morning commute does not occur in Abidjan. This difference may be indicative of people both living and working in close proximity in Abidjan, as opposed to commuting in from outside or across the city, as is often the case in developed regions with more comprehensive public transport facilities.
We examined the country-specific commuting pattern more closely by looking at how the distance commuted may affect the daily behaviors. The observed distances traveled were binned (0-1 km, 1-5 km, 5-10 km, 10-20 km, and 20-50 km), and the daily temporal profile of the probability of displacement was computed for each bin for the two countries, as shown in Figure 2E.
The temporal profiles for both Ivory Coast and Portugal show a bimodal pattern in Figure 2E. For Ivory Coast the morning peak is around 7 a.m. and the evening peak is around 8 p.m., as opposed to roughly 10 a.m. and 6 p.m. for Portugal. Note that the peaks for Portugal are much sharper than those of Ivory Coast. Overall these differences indicate a shorter prime commuting period for Portugal, which is likely indicative of the ability for commuters to travel more efficiently to their destinations.
Large networks, such as the telecommunications or transportation networks of a nation, often exhibit community structure, i.e., the organization of vertices into clusters with many edges joining vertices of the same cluster and comparatively few edges joining vertices of different clusters. Identifying the community structure in such networks has many applications, such as better placement and provisioning of services. Recently, this type of community structure analysis has been performed on land-line communications in Great Britain, and on mobile connections in Belgium, the United States, and various other countries across Europe, Asia, and Africa [22, 23]. While there is research investigating the impact of physical human mobility on the space-independent community structure, there has been a lack of similar research for developing nations.
Network modularity is a measure of the strength of the division of a network into clusters. Networks with high modularity have dense connections between nodes within clusters, and sparse connections between nodes in different clusters. Modularity is computed as the fraction of edges that fall within a cluster, minus the expected such fraction if the edges were distributed at random with respect to the node strength distribution. The value of modularity is bounded above by 1, and is positive if the number of edges within groups exceeds the number expected on the basis of chance.
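The modularity Q of a partition is given by

$$Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)$$

where $A_{ij}$ is the weight of the link from i to j, $k_i = \sum_j A_{ij}$ is the sum of the weights from node i, $c_i$ is the community that node i was assigned to, $m = \frac{1}{2} \sum_{i,j} A_{ij}$, and $\delta(c_i, c_j)$ is 1 if $c_i = c_j$ and 0 otherwise.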
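The modularity computation described above can be sketched in a few lines. This is a minimal illustration that assumes the standard weighted, symmetric form Q = (1/2m) Σ [A_ij − k_i k_j / (2m)] δ(c_i, c_j); the dictionary-based network representation and the function name are hypothetical, and a directed mobility network would need the directed variant of the formula.

```python
def modularity(weights, community):
    """Weighted modularity of a partition of a symmetric network.

    weights:   dict {(i, j): A_ij}; both (i, j) and (j, i) must be present
    community: dict {node: community label}
    """
    # k_i: total weight attached to node i.
    strength = {}
    for (i, _), w in weights.items():
        strength[i] = strength.get(i, 0.0) + w
    two_m = sum(strength.values())  # equals 2m for a symmetric network

    q = 0.0
    for i in strength:
        for j in strength:
            if community[i] == community[j]:
                q += weights.get((i, j), 0.0) - strength[i] * strength[j] / two_m
    return q / two_m

# Toy example: two disconnected pairs of towers, each pair its own community.
weights = {(0, 1): 1.0, (1, 0): 1.0, (2, 3): 1.0, (3, 2): 1.0}
split = {0: "a", 1: "a", 2: "b", 3: "b"}
merged = {0: "a", 1: "a", 2: "a", 3: "a"}
print(modularity(weights, split), modularity(weights, merged))
```

The double loop is quadratic in the number of nodes, which is acceptable for a few thousand antennae but would be replaced by a per-community sum for larger networks.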
High modularity in mobility networks may point to an efficient organization of residences, employment, and services all in close proximity, or it may point to restrictive policies or infrastructures that limit free movement across communities. We were interested in the community structure of developing nations, such as Ivory Coast, especially in comparison to more developed nations, such as Portugal.
We used datasets D1 and D2 to build the human mobility networks and identify the community structure of antennae within the Ivory Coast and Portugal. We set nodes to the locations of each cell tower, and edge weights to the total number of migrations of all people who placed two consecutive calls between the two nodes within a time window of 24 hours. We tested the following community detection algorithms: Louvain, Le Martelot, Newman, Infomap, and a recently proposed method of community detection. We computed the modularity of the community structures identified by each of these methods. The method that provided the highest modularity was chosen for this part of the study. Figure 1C graphically compares the communities identified (in color) with their first level administrative boundaries (outlined in black).
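The network construction described above can be sketched as follows. The record layout `(user_id, timestamp_hours, tower_id)`, the exclusion of consecutive calls made from the same tower, and the inclusive 24-hour threshold are assumptions made for illustration rather than details confirmed by the study.

```python
from collections import Counter, defaultdict

def build_mobility_network(calls, max_gap_hours=24):
    """Count migrations between towers from consecutive calls of each user.

    calls: list of (user_id, timestamp_hours, tower_id)
    Returns Counter {(origin_tower, destination_tower): number of migrations}.
    """
    by_user = defaultdict(list)
    for user, t, tower in calls:
        by_user[user].append((t, tower))

    edges = Counter()
    for records in by_user.values():
        records.sort()  # chronological order per user
        for (t0, origin), (t1, dest) in zip(records, records[1:]):
            # Keep only tower changes that happen within the time window.
            if origin != dest and t1 - t0 <= max_gap_hours:
                edges[(origin, dest)] += 1
    return edges

# Toy example (hypothetical records).
calls = [("u1", 0, "A"), ("u1", 3, "B"), ("u1", 40, "A"),
         ("u2", 1, "A"), ("u2", 2, "B"), ("u2", 5, "B")]
network = build_mobility_network(calls)
print(dict(network))  # {('A', 'B'): 2}
```

The resulting weighted, directed edge list is the kind of input a community detection algorithm would take.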
Table: Comparison of similarity indices (including the Adjusted Rand index) between the community partitioning generated from the network of human mobility and the respective administrative boundaries.
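As an illustration of how two partitions of the same set of antennae can be compared, the sketch below computes the Adjusted Rand index from the contingency counts of two label lists. The function name and the toy labels are hypothetical, and returning 1.0 in the degenerate case (where the index is otherwise undefined) is a convention chosen for this sketch.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same items."""
    n = len(labels_a)
    # Number of item pairs placed together in both, in A, and in B.
    pairs_joint = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = pairs_a * pairs_b / comb(n, 2)  # agreement expected by chance
    maximum = (pairs_a + pairs_b) / 2
    if maximum == expected:  # degenerate case, e.g. both partitions are trivial
        return 1.0
    return (pairs_joint - expected) / (maximum - expected)

# Toy example: mobility communities vs. administrative regions for 4 antennae.
communities = [0, 0, 1, 1]
print(adjusted_rand_index(communities, [1, 1, 0, 0]))  # same grouping, relabeled
print(adjusted_rand_index(communities, [0, 1, 0, 1]))  # grouping that cuts across
```

A value of 1 indicates identical partitions up to relabeling, while values near 0 indicate agreement no better than chance.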
While this significant difference in community and official boundary alignments may be attributable to the layout of infrastructure along official boundaries, we began to question whether there might be more fundamental differences. Previous studies have shown that other factors, such as geographical features, can play an important role in how communities are formed and services are sought [19, 37]. However, little has been done to investigate the direct impact of culture and language on human mobility.
Ivory Coast represents an especially interesting context to investigate cultural and linguistic influences on mobility within a single nation. The Ivory Coast is a nation made up of more than 60 distinct tribes, classified into 5 principal regions. The official language is French, although many of the local languages are widely used, including Baoulé, Dioula, Dan, Anyin and Cebaara Senufo, and an estimated 65 languages are spoken in the country.
Intuitively, these cultural and linguistic differences are likely to influence mobility patterns in the region. However, it is also known that as regions become more industrialized, cultural ways are often blended or lost altogether. Portugal represents an interesting context for the latter, as Portuguese is the single national language of Portugal, and any tribal boundaries pre-date Roman times.
Given the vast differences between the network community structure and the administratively defined boundaries, our goal in the next section was to understand whether the tribal structure of the Ivory Coast could be an influential factor in the mobility patterns of the region.
Tribal community analysis
Since the communities detected in Section IV exhibited low similarity to administrative boundaries, we began to investigate factors present in the Ivory Coast that may account for such a substantial difference. People and their behaviors throughout a large portion of Africa are still impacted by their tribal affiliations, whereas any tribal boundaries in Portugal predate Roman times; the impact of tribal boundaries in Ivory Coast was therefore a natural factor to study. We generated digital shape files for each of the eight distinct tribal regions in Ivory Coast. Such tribal maps do have a level of uncertainty associated with them due to migrations over time; however, we used the most recent versions of tribal boundaries available.
As a first measure of impact of tribes on mobility, we aggregated the mobility network between communities. Figure 3B provides a plot of the mobility network with each node representing a sub-tribal community, and each edge is colored on a logarithmic scale to reflect the number of migrations between the connected nodes in the mobility network. Nodes are colored with the same logarithmic scheme and represent the sum of all migrations into this location. The intra-tribal community mobility network is plotted separately from the inter-tribal community mobility network, in order to facilitate comparison. The intra-tribal network plots only those edges between communities in the same tribe. The inter-tribal network plots only edges between communities of differing tribes.
This diagram provides a first insight into tribal influences on mobility. Note that the number of intra-tribal migrations (as indicated by the color coding of edges) dwarfs the number of inter-tribal migrations. Additionally, the inter-tribal migrations are largely dominated by connections to the largest city, Abidjan.
We also quantified the strength of these partitioning ties by computing the network modularity of the sub-tribal communities versus that of the administrative boundaries. The network modularity of the sub-tribal communities was 0.6548, in comparison to a network modularity of 0.6158 for the administrative boundaries. Given that the number of regions in the two partitions (sub-tribal and sub-prefecture) is very similar, the higher network modularity of the sub-tribal communities shows again that mobility patterns align more strongly with a sub-tribal partitioning of the country than with an administrative partitioning.
Modeling human mobility
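In the Gravity Model, the flux $T_{ij}$ between two locations is estimated as

$$T_{ij} = A \, \frac{m_i \, n_j}{r_{ij}^{\gamma}} \qquad (2)$$

where i and j are origin and destination locations with populations $m_i$ and $n_j$ respectively, at a distance of $r_{ij}$ from each other. A is a normalization factor and γ is an adjustable parameter chosen to fit the data.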
To determine which model would be more appropriate, we plotted the normalized flux between two locations separated by a given distance in Figure 4C. We fit these normalized migrational fluxes to both a power-law and an exponential function (blue and red lines) and observe that the power-law distribution is much more consistent with our data (which is also confirmed by comparing the mean absolute percent error values reported on the plot). Therefore, for the remainder of this study we used the gravity model in Equation (2).
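The power-law gravity model selected above can be sketched as follows, assuming the form T_ij = A · m_i · n_j / r_ij^γ. The exponent is estimated here by an ordinary least-squares fit in log-log space, which is one common choice and not necessarily the fitting procedure used in the study; all function names and toy values are hypothetical.

```python
import math

def gravity_flux(pop_origin, pop_dest, distance, gamma, a=1.0):
    """Power-law gravity model: T_ij = A * m_i * n_j / r_ij**gamma."""
    return a * pop_origin * pop_dest / distance ** gamma

def fit_gamma(distances, normalized_fluxes):
    """Estimate gamma as minus the least-squares slope of log(flux) vs. log(distance)."""
    xs = [math.log(d) for d in distances]
    ys = [math.log(f) for f in normalized_fluxes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope

# Toy example: synthetic normalized fluxes that decay exactly as r**-2.
distances = [1.0, 2.0, 4.0, 8.0]
fluxes = [d ** -2.0 for d in distances]
gamma = fit_gamma(distances, fluxes)
print(round(gamma, 6), gravity_flux(1000, 500, 10.0, gamma=2.0))
```

An exponential alternative would instead fit log(flux) against distance itself, and the two fits could then be compared by their errors, as done with Figure 4C.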
In order to apply the model one must inherently define the appropriate spatial resolution (i.e., the areas to model migrations between/within). By monitoring the resulting accuracy of the model, it is possible to gain insight on what type of partitioning of an area will most effectively allow for human mobility to be modeled. We started by investigating if the use of these sub-tribal communities would provide an advantage in modeling the mobility network of the Ivory Coast compared to an administrative (subprefecture) partitioning.
We computed the Gravity Model using dataset D1, specifically modeling the migrations for both the administrative and the sub-tribal partitioning of the country, and subsequently tested the accuracy of the model using the mean absolute percent error (MAPE) with respect to the true network of human mobility. MAPE has been shown to be a very effective measure of error in model predictions, especially when considering population forecasting [39, 41].
MAPE is defined as

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right|$$

where $A_t$ is the actual value, $F_t$ is the forecasted value, and n is the number of data points. An inaccurate model will therefore yield a high MAPE value, whereas an accurate model will yield a much smaller MAPE value.
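A minimal sketch of the MAPE computation is shown below. It assumes the usual definition, the mean of |actual − forecast| / |actual| expressed in percent, and it skips data points whose actual value is zero, which is a choice made for this illustration only.

```python
def mape(actual, forecast):
    """Mean absolute percent error (in percent) between two equal-length sequences."""
    # Points with a zero actual value are skipped to avoid division by zero.
    terms = [abs(a - f) / abs(a) for a, f in zip(actual, forecast) if a != 0]
    return 100.0 * sum(terms) / len(terms)

# Toy example: observed vs. modeled migration counts for three region pairs.
observed = [100, 200, 50]
modeled = [110, 180, 50]
print(mape(observed, modeled))
```

In this toy case the first two predictions are each off by 10% and the third is exact, so the result is about 6.67%.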
In the Radiation Model, the flux $T_{ij}$ is estimated as

$$T_{ij} = \frac{T_i}{1 - m_i / M} \, \frac{m_i \, n_j}{(m_i + s_{ij})(m_i + n_j + s_{ij})}$$

where i and j are origin and destination locations with populations $m_i$ and $n_j$ respectively, at distance $r_{ij}$ from each other, with $s_{ij}$ representing the total population in the circle of radius $r_{ij}$ centered at i (excluding the source and destination population). $T_i$ signifies the total outgoing flux that originates from region i. Since our system represents a finite space (the regions and boundaries within a country), we normalize by a factor of $1 / (1 - m_i / M)$, where M is the total sample population in the system.
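The Radiation Model described above can be sketched as follows, assuming the commonly used form T_ij = T_i · m_i n_j / ((m_i + s_ij)(m_i + n_j + s_ij)) with the finite-size correction 1 / (1 − m_i / M). Planar coordinates, the dictionary layout, and counting locations at exactly distance r_ij as inside the circle are assumptions of this sketch.

```python
import math

def radiation_flux(i, j, coords, population, outflow_i, finite_size=True):
    """Radiation model flux from location i to location j.

    coords:     dict {location: (x, y)}
    population: dict {location: population}
    outflow_i:  total outgoing flux T_i that originates from i
    """
    m_i, n_j = population[i], population[j]
    r_ij = math.dist(coords[i], coords[j])
    # s_ij: population within radius r_ij of i, excluding origin and destination.
    s_ij = sum(p for k, p in population.items()
               if k not in (i, j) and math.dist(coords[i], coords[k]) <= r_ij)
    flux = outflow_i * m_i * n_j / ((m_i + s_ij) * (m_i + n_j + s_ij))
    if finite_size:
        # Finite-system normalization, with M the total population.
        flux /= 1.0 - m_i / sum(population.values())
    return flux

# Toy example (hypothetical values): three locations on a line.
coords = {"a": (0, 0), "b": (1, 0), "c": (3, 0)}
population = {"a": 100, "b": 100, "c": 200}
print(radiation_flux("a", "b", coords, population, outflow_i=60),
      radiation_flux("a", "c", coords, population, outflow_i=60))
```

With the finite-size correction, the fluxes leaving "a" in this toy system add up to the total outflow of 60, which is the purpose of the normalization.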
We also partitioned the mobility model predictions according to intra-tribal and inter-tribal flux in order to quantify the strength of the connectivity of the tribes. Quantitatively, the MAPE for the inter-tribal mobility was 11.3% higher than the MAPE of the intra-tribal mobility predictions, and supports the dominant pattern of intra-tribal migrations over inter-tribal migration which may require special consideration in terms of mobility modeling. Again, the fact that the Radiation model produces more accurate results for migrations within a single tribe compared to those between tribes suggests that the tribes themselves are playing a key role in the overall improved accuracy of the model.
Figures 5B-D compare the probability of migration predicted by both the Gravity and Radiation Models to the actual migration percentages as computed from CDR data. While the CDR data is not the ground truth for migration (for example, it is a sample of the total population and is tracked only to the antenna level), it is also the basis for both models and thus represents a valid comparison for this study. As a more direct comparison of accuracy, Figure 5E provides the error (MAPE) for the models plotted in Figures 5B-D. For the Ivory Coast, these figures show the higher accuracy (i.e., lower error) of the Radiation Model for both administrative and sub-tribal communities, and they show that using the Radiation Model with sub-tribal communities provides the highest accuracy (i.e., lowest MAPE).
Figure 5D illustrates that the Portuguese administrative municipality boundaries perform well for both the Radiation and Gravity Models. This may indicate that municipal boundaries were designed to align with cultural and social communities, or that cultural and social communities have adapted to fit administrative boundaries.
However, we believe a more likely explanation is the growing homogeneity of language and culture that comes with maturing industrialization and urbanization. This is reflected in the predominance of Portuguese as the national language in Portugal, compared to the more than 60 local languages spoken in Ivory Coast.
Models of mobility, migration, and interaction that are conceptualized in mature and industrialized regions may not directly map to developing regions with more pronounced cultural and linguistic differences. Such models need to better account for these differences.
If administrative boundaries are drawn and services are placed based on models that do not accurately reflect these influences, results could include inefficiencies, leading to inequality of services (e.g., longer or less accessible commutes), and potentially discrimination and alienation of segments of the population.
Techniques for assessing the strength of borders in a given regional partitioning scheme are critical to ensuring the accuracy of mobility and migration modeling, and, perhaps more importantly, to enabling sound decision-making by authorities tasked with setting effective administrative boundaries.
In the following section, we propose novel techniques for quantifying the impact of a regional partition scheme on model accuracy, and for assessing the strength of borders in a given partitioning scheme.
Assessing regional affinities and border strength
In previous sections, we demonstrated the issues arising from using mobility models, such as the Gravity and Radiation models, on regions where the partitioning scheme, such as Ivory Coast administrative boundaries, does not reflect the regional affinities and border strengths these models assume. We illustrated techniques to create more appropriate partitioning schemes, such as the tribal communities, and demonstrated the ability to achieve higher accuracy mobility modeling using this improved partitioning.
However, it may not always be possible to simply re-draw borders. Instead, tools are needed to assess the efficacy of an existing partitioning scheme, in terms of the affinities within the identified borders and the strength of the borders.
In this section, we propose two novel techniques to address the above challenge. Firstly, we present a metric to determine whether affinities exist within borders that may impact the accuracy of mobility modeling. We test our metric on the tribal and administrative boundaries used in the previous sections. Secondly, we propose a technique to assess the strength of existing borders, and we demonstrate our technique on the existing administrative borders of Portugal and Ivory Coast. Our techniques use the same mobility data used in previous sections and can be performed on any regional partitioning scheme, and thus provide valuable tools to mobility researchers and to urban planners.
The accuracy of both the Gravity Model and the Radiation Model depends on the ability to accurately model, for a given time epoch, movement from any region to any other region, as well as the lack of movement to another region. Inter-region movements are driven by opportunities and resources, which are reflected in the population of a region, and constrained by distance. Intra-region affinities, such as physical proximity to home, work, family, and friends, tend to limit movement from a given region. Partitioning a region without accounting for these affinities results in over- or under-predicting flux and, therefore, in poor model performance.
We propose a metric to assess whether such affinities exist, and test that metric on the tribal and administrative boundaries used in previous sections. To compute this metric, we segregated all migrations into two categories: intra- and inter-regional migrations. Referring to Figure 3A, an inter-region migration is a migration that crosses a color boundary. Likewise, an intra-region migration is a migration that does not cross a color boundary. When a model over-predicts the number of inter-region migrations, it is under-estimating the strength of the affinities within that region.
We define the metric S as

$$S = \frac{1}{n} \sum_{k=1}^{n} \frac{R_k}{T_k}$$

where $R_k$ is the modeled flux, $T_k$ is the true flux, and n is the total number of predictions made. In other words, S represents the average ratio of modeled to true fluxes in the system. Therefore, relatively high values of S indicate that the model is generally over-predicting, while lower values indicate a general under-prediction by the model. We compute separate $S_{\mathrm{inter}}$ and $S_{\mathrm{intra}}$ values, where $S_{\mathrm{inter}}$ includes only inter-region flux and $S_{\mathrm{intra}}$ includes only intra-region flux.
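The metric described above, read as the average ratio of modeled to true flux, can be sketched as follows, with the flows split into intra-region and inter-region sets before averaging. The data layout, the function names, and the choice to skip pairs with zero true flux are assumptions made for this illustration.

```python
def prediction_ratio(pairs):
    """Average of modeled/true flux over (modeled, true) pairs with nonzero true flux."""
    ratios = [modeled / true for modeled, true in pairs if true > 0]
    return sum(ratios) / len(ratios)

def intra_inter_ratios(flows, region_of):
    """Return (S_intra, S_inter) for a given regional partition.

    flows:     dict {(origin, destination): (modeled_flux, true_flux)}
    region_of: dict {location: region label}
    """
    intra = [v for (i, j), v in flows.items() if region_of[i] == region_of[j]]
    inter = [v for (i, j), v in flows.items() if region_of[i] != region_of[j]]
    return prediction_ratio(intra), prediction_ratio(inter)

# Toy example (hypothetical values): locations a and b share a region, c does not.
region_of = {"a": "X", "b": "X", "c": "Y"}
flows = {("a", "b"): (80, 100), ("b", "a"): (90, 100),
         ("a", "c"): (30, 20), ("b", "c"): (10, 10)}
s_intra, s_inter = intra_inter_ratios(flows, region_of)
print(s_intra, s_inter)
```

In this toy case the inter-region ratio is above 1 and the intra-region ratio below 1, the pattern that would suggest the model under-estimates the affinities within regions.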
Just as human mobility may be constrained by affinities, it is similarly impacted by the strength of surrounding borders. Mobility models must accurately and succinctly reflect borders that may be physical, such as gates or other guards, or abstract, such as lack of opportunities. However, neither the Gravity Model nor the Radiation Model provides a means to assess border strength.
In this section, we propose a novel metric for assessing the strength of borders within a region. We demonstrate our metric by computing the strengths of the administrative borders of Portugal and Ivory Coast used in previous sections. As predicted by the Gravity and Radiation Models, we show that borders surrounding the heavily populated region of Abidjan are more penetrable than borders elsewhere in the Ivory Coast. Similarly, our results show that the borders throughout the country of Portugal are much more uniform, even showing consistency with the more heavily populated Lisbon. Our metric allows us to show that while these two cities, located in two very different regions, share a higher penetrability of borders than surrounding regions, they also exhibit significant differences in the distribution of border strength.
where i and j are nodes, is the weighted directed matrix of human mobility, , , and . Finally, is the difference between the actual number of migrations and the expected number of migrations from i to j.
Therefore, for any given node i, . Further, indicates node i is as strongly connected to at least one partition other than the partition to which it is assigned, and thus we say node i is not stable within the partition it is located. Negative values of indicate node i is more strongly connected to partitions other than the partition to which it is assigned (i.e., the considered partitioning is not optimal in terms of the modularity score), and positive values of indicate node i is more strongly connected to its assigned partition .
Note also that, because it is computed from values that are linearly interpolated across the entire region, our border strength metric reflects the aggregate impact of cells that are not well connected, even if those cells are not physically close to borders.
The distributions illustrated in Figure 8 also provide interesting insights into mobility in Ivory Coast and Portugal. More specifically, notice the tight clustering of border strengths for Abidjan under the Tribal partitioning (Figure 8A) versus the more widely distributed border strength values for the rest of Ivory Coast. This supports our earlier findings that Tribal boundaries play a key role in mobility throughout the country, except in highly industrialized regions such as Abidjan.
Africa is a continent that has been shaped by human migration over tens of thousands of years. Indeed, migrations within and beyond the African borders have recently been shown to have influenced all civilizations as we know them. However, until recently, there has been a dearth of data on the forms and patterns of migration within the nations of Africa. Moreover, much mobility research is based on theories that emerged from highly industrialized nations and that lack validation in the context of developing environments.
Our study has demonstrated that many of these conceptions are not necessarily applicable in the African context. We have made these differences clear by comparing our findings in Ivory Coast to one such industrialized nation, Portugal. For example, we have shown that the probability of displacement during normal commuting hours in Portugal is often nearly double that of Ivory Coast for the same time of day. Similarly, the average distance traveled by commuters in Portugal is nearly double that of commuters in Ivory Coast.
Differences in the likelihood of travel and in the average distance traveled can be attributed to quantitative differences in infrastructural support for mobility, which already strongly affect the overall mobility picture and lead to a number of quantitative dissimilarities. Our study also shows evidence of more fundamental differences affecting mobility, such as tribal, cultural, and linguistic differences. In addition, we demonstrate that the similarity between administrative boundaries and communities detected in mobile phone data is markedly lower in Ivory Coast than in Portugal.
By identifying the tribal influence on mobility in Ivory Coast, we were able to illuminate further differences in mobility patterns. For example, we showed that intra-tribal migrations were much more frequent than inter-tribal migrations over the same distance, and that the models therefore under- or overestimate them. Taking this into account by using our tribally aligned communities as the spatial units for the mobility models drastically improves the modeling of human mobility in Ivory Coast. We validated this higher accuracy by computing the MAPE across all data points for both models, and found a 20% to 50% higher error for the models using administrative boundaries. We also validated our results by computing the distribution of migrations by distance migrated, and found that defining spatial units with this sub-tribal method improved the accuracy of the models so drastically that the models for Ivory Coast performed even better than those for its developed-country counterpart, Portugal.
We propose novel techniques for assessing the strength of borders within a regional partitioning scheme, and for assessing the impact of inappropriate partitioning on model accuracy. Our results offer improved insights on why models developed for mature and stable regions may not translate well to developing regions, and provide tools for urban planners and data scientists to address these deficiencies.
We believe that the findings of this study demonstrate important differences that exist between developing and industrialized regions. Using these two countries as an example, we are motivated to further explore these differences by considering more countries and areas with diverse cultural, economic, and social backgrounds.
The authors wish to thank Orange and D4D Challenge for providing the datasets used throughout this study. We further thank Ericsson, the MIT SMART Program, the Center for Complex Engineering Systems (CCES) at KACST and MIT CCES program, the National Science Foundation, the MIT Portugal Program, the AT&T Foundation, Audi Volkswagen, BBVA, The Coca Cola Company, Expo 2015, Ferrovial, The Regional Municipality of Wood Buffalo and all the members of the MIT SENSEable City Lab Consortium for supporting the research.
- Robertson C, Sawford K, Daniel SL, Nelson TA, Stephen C: Mobile phone-based infectious disease surveillance system, Sri Lanka. Emerg Infect Dis 2010, 16(10): Article ID 1524.
- Gonzalez MC, Hidalgo CA, Barabasi A-L: Understanding individual human mobility patterns. Nature 2008, 453(7196):779–782. 10.1038/nature06958
- Ratti C, Williams S, Frenchman D, Pulselli R: Mobile landscapes: using location data from cell phones for urban analysis. Environ Plan B, Plan Des 2006, 33(5): Article ID 727.
- Reades J, Calabrese F, Ratti C: Eigenplaces: analysing cities using the space-time structure of the mobile phone network. Environ Plan B, Plan Des 2009, 36(5):824–836. 10.1068/b34133t
- Simini F, González MC, Maritan A, Barabási A-L: A universal model for mobility and migration patterns. Nature 2012, 484(7392):96–100. 10.1038/nature10856
- Kang C, Sobolevsky S, Liu Y, Ratti C: Exploring human movements in Singapore: a comparative analysis based on mobile phone and taxicab usages. In Proceedings of the 2nd ACM SIGKDD international workshop on urban computing. ACM, New York; 2013:1.
- Blondel VD, Esch M, Chan C, Clerot F, Deville P, Huens E, Morlot F, Smoreda Z, Ziemlicki C (2012) Data for development: the d4d challenge on mobile phone data. Preprint arXiv:1210.0137
- Eagle N: Behavioral inference across cultures: using telephones as a cultural lens. IEEE Intell Syst 2008, 23(4):62–64.
- Eagle N, de Montjoye Y, Bettencourt LM: Community computing: comparisons between rural and urban societies using mobile phone data. International conference on computational science and engineering, vol 4, 2009. CSE’09 2009, 144–150. IEEE
- Ahas R, Aasa A, Mark Ü, Pae T, Kull A: Seasonal tourism spaces in Estonia: case study with mobile positioning data. Tour Manag 2007, 28(3):898–910. 10.1016/j.tourman.2006.05.010
- Berlingerio M, Calabrese F, Di Lorenzo G, Nair R, Pinelli F, Sbodio ML: Allaboard: a system for exploring urban mobility and optimizing public transport using cellphone data. Proceedings of the third international conference on the analysis of mobile phone datasets (NetMob) 2013.
- Liu F, Janssens D, Wets G, Cools M: Profiling workers’ activity-travel behavior based on mobile phone data. Proceedings of the third international conference on the analysis of mobile phone datasets (NetMob) 2013.
- Zilske M, Nagel K: Building a minimal traffic model from mobile phone data. Proceedings of the third international conference on the analysis of mobile phone datasets (NetMob) 2013.
- Naboulsi D, Fiore M, Stanica R, et al.: Human mobility flows in the city of Abidjan. 3rd international conference on the analysis of mobile phone datasets 2013.
- Yan X-Y, Zhao C, Wang W: Predicting human mobility patterns in cities. 3rd international conference on the analysis of mobile phone datasets 2013.
- AfriPop: Cote D’Ivoire. http://www.clas.ufl.edu/users/atatem/index_files/CIV.htm
- Portal do Instituto Nacional de Estatística. http://www.ine.pt
- Voronoï G: Nouvelles applications des paramètres continus à la théorie des formes quadratiques. Deuxième mémoire. Recherches sur les parallélloèdres primitifs. J Reine Angew Math 1908, 134: 198–287.
- Ratti C, Sobolevsky S, Calabrese F, Andris C, Reades J, Martino M, Claxton R, Strogatz SH: Redrawing the map of Great Britain from a network of human interactions. PLoS ONE 2010, 5(12): Article ID 14248.
- Expert P, Evans TS, Blondel VD, Lambiotte R: Uncovering space-independent communities in spatial networks. Proc Natl Acad Sci USA 2011, 108(19):7663–7668. 10.1073/pnas.1018962108
- Calabrese F, Dahlem D, Gerber A, Paul D, Chen X, Rowland J, Rath C, Ratti C: The connected states of America: quantifying social radii of influence. 2011 IEEE third international conference on privacy, security, risk and trust (passat) and 2011 IEEE third international conference on social computing (socialcom) 2011, 223–230. IEEE
- Sobolevsky S, Szell M, Campari R, Couronné T, Smoreda Z, Ratti C: Delineating geographical regions with networks of human interactions in an extensive set of countries. PLoS ONE 2013, 8(12): Article ID 81707.
- Sobolevsky S: Digitale ansätze für eine regionale abgrenzung. Bauwelt fundamente 150. In Die stadt entschlüsseln: wie echtzeitdaten den urbanismus verändern. Edited by: Offenhuber D, Ratti C. Birkhäuser, Basel; 2013.
- Newman ME: Modularity and community structure in networks. Proc Natl Acad Sci USA 2006, 103(23):8577–8582. 10.1073/pnas.0601602103
- Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E: Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008, 2008(10): Article ID 10008.
- Le Martelot E, Hankin C: Multi-scale community detection using stability as optimisation criterion in a greedy algorithm. KDIR 2011, 216–225.
- Newman ME: Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 2006, 74(3): Article ID 036104.
- Lancichinetti A, Fortunato S: Community detection algorithms: a comparative analysis. Phys Rev E 2009, 80(5): Article ID 056117.
- Sobolevsky S, Campari R, Belyi A, Ratti C (2013) A general optimization technique for high quality community detection in complex networks. Preprint arXiv:1308.3508
- Wallace DL: Comment. J Am Stat Assoc 1983, 78(383):569–576.
- Rand WM: Objective criteria for the evaluation of clustering methods. J Am Stat Assoc 1971, 66(336):846–850. 10.1080/01621459.1971.10482356
- Jaccard P: Étude comparative de la distribution florale dans une portion des alpes et des jura. Bull Soc Vaud Sci Nat 1901, 37: 547–579.
- Fowlkes EB, Mallows CL: A method for comparing two hierarchical clusterings. J Am Stat Assoc 1983, 78(383):553–569. 10.1080/01621459.1983.10478008
- Heckerman D, Meila M (1998) An experimental comparison of several clustering and initialization methods. Technical report, Citeseer
- Hubert L, Arabie P: Comparing partitions. J Classif 1985, 2(1):193–218. 10.1007/BF01908075
- Larsen B, Aone C: Fast and effective text mining using linear-time document clustering. In Proceedings of the fifth ACM SIGKDD international conference on knowledge discovery and data mining. ACM, New York; 1999:16–22.
- Onnela J-P, Arbesman S, González MC, Barabási A-L, Christakis NA: Geographic constraints on social network groups. PLoS ONE 2011, 6(4): Article ID 16939.
- Cote d’Ivoire - Encyclopedia Britannica Academic Edition Inc. http://www.britannica.com/EBchecked/topic/139651/Cote-dIvoire
- Swanson DA, Tayman J, Bryan T: MAPE-R: a rescaled measure of accuracy for cross-sectional subnational population forecasts. J Popul Res 2011, 28(2–3):225–243. 10.1007/s12546-011-9054-5
- Ravenstein EG: The laws of migration. J Stat Soc 1885, 48(2):167–235.
- Tayman J, Swanson DA: On the validity of MAPE as a measure of population forecast accuracy. Popul Res Policy Rev 1999, 18(4):299–322. 10.1023/A:1006166418051
- Masucci AP, Serras J, Johansson A, Batty M: Gravity versus radiation models: on the importance of scale and heterogeneity in commuting flows. Phys Rev E 2013, 88(2): Article ID 022812.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.