- Regular article
- Open Access
Commuting patterns: the flow and jump model and supporting data
- Levente Varga^{1},
- Géza Tóth^{2} and
- Zoltán Néda^{1}
- Received: 19 April 2018
- Accepted: 7 October 2018
- Published: 11 October 2018
Abstract
A simple model, named the flow and jump model (FJM), is used to describe commuter fluxes at different distances. The model is based on a master equation that allows a local net probability flow and non-local jumps. FJM is in principle a one-parameter model; however, we find that fixing this parameter yields a parameter-free model, similar to the radiation model. We find that FJM offers an improved description of commuting data from the USA, Italy and Hungary. For a special choice of the model parameter, FJM reduces to the radiation model.
Keywords
- Human mobility models
- Commuters data
- Population density
1 Introduction
Commuter mobility patterns are the focus of many recent studies. By its nature the problem belongs to the research fields of human geography, sociology and economics. Nowadays, however, researchers from many other fields have become interested in the topic. This interest can be explained by the fact that many large electronic datasets have become available to researchers, making it possible to test both the assumptions and the main results of the models. Statisticians and data scientists are interested in the universal patterns that govern commuter fluxes, a special case of human mobility, at different spatial scales. Physicists and mathematicians are interested in simple models capable of explaining the observed patterns. A detailed account of the state of the art in the field of human mobility is given in the recent review article of Barbosa et al. [1].
Models for commuter fluxes, motivated phenomenologically by simple socio-economic or probabilistic arguments, were proposed as early as 1940 by Stouffer (the intervening opportunities model) [2] and in 1960 by Block & Marschak (the random utility model) [3]. Analogies with classical physics phenomena were exploited by the very popular gravity and generalized potential models [4].
From a modeling perspective, the radiation model introduced by Simini et al. [5] represented a great leap in the understanding of human mobility patterns. In contrast with earlier, phenomenologically argued models, the radiation model starts from a basic socio-economic optimization assumption and derives a simple and compact formula for commuter fluxes. Relative to the earlier models, the compact result derived in [5] also has the advantage of being parameter-free. When compared with real commuting and population data, however, the model contains an undetermined proportionality constant that connects the population to the number of available jobs. In this sense one can argue that the model has one free parameter. Other models of similar complexity, also built on realistic assumptions, are the population-weighted opportunities model (PWO) [6] and a novel version of it in which memory effects are also considered [7]. Recently a new parameter-free model was introduced by Liu and Yan [8]. Their basic assumption is that individuals select destination locations that present higher opportunity benefits than those at the origin and than the intervening opportunities between the origin and destination.
The radiation model was generalized to a continuous population distribution [9] and was also made more realistic by allowing a realistic job selection for the individuals. This new model, the radiation model with selection, offered an improved fit for commuting data in the USA. A further improvement of the simple radiation model was obtained by also taking into account the travel cost involved in commuting (see for example [10]). Along these lines, the travel cost optimized radiation model that we introduced recently [11] offered an improved description of the commuting fluxes in Hungary. The main drawback of all these later generalizations of the radiation model is that the parameter-free beauty of the radiation model is lost.
Here we offer a further generalization of the original radiation model and demonstrate its advantages relative to the earlier models using large-scale population density and commuter flux data from the USA, Italy and Hungary. The nice aspect of this generalization is that our model is again parameter-free, since the only fitting parameter is fixed to a universal value.
2 Modelling framework
The gravity model (GM) is probably the best-known approach for describing empirically the commuter fluxes between cities [12]. It is based on a phenomenological analogy with gravity, assuming that the interaction between two regions or cities is inversely proportional to the distance between them raised to a positive power and directly proportional to the sizes of the two regions or cities. Contrary to what is usually believed, GM is not merely a simple analogy: there are also theoretical arguments in its favour. The oldest is probably the one based on the maximal entropy hypothesis [13, 14]. Other successful attempts are based on the principle of utility maximization in economics. Both deterministic [15, 16] and random utility theories [17] were considered.
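The verbal definition of the gravity model above can be illustrated with a minimal Python sketch. The function name, the prefactor `k` and the distance exponent `gamma` are assumptions introduced here for illustration; the text only states that the exponent is positive and that the flux is proportional to both sizes.

```python
def gravity_flux(size_i: float, size_j: float, distance: float,
                 k: float = 1.0, gamma: float = 2.0) -> float:
    """Gravity-type interaction between two regions.

    Proportional to both sizes and inversely proportional to the
    distance raised to a positive power. k and gamma are illustrative
    values, not fitted parameters from the article.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return k * size_i * size_j / distance ** gamma


# With gamma = 2, doubling the distance divides the flux by four.
near = gravity_flux(100, 200, 10)
far = gravity_flux(100, 200, 20)
```

In practice the prefactor and exponent would be fitted to the commuting data, and generalized forms also raise the two sizes to fitted powers.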
3 Data source and format
a. Settlement data, giving the code of each settlement together with its latitude and longitude. For the USA, the total number of settlements is \(Q = 73\text{,}803\). These geographical locations are the sources and targets of commuting. The data is in the format given below:
set. code | lat | lon |
1 | 32.4771763256 | −86.4901731173 |
2 | 32.474292121 | −86.4733798888 |
3 | 32.4754563613 | −86.460168641 |
b. Commuting data, containing the sources and targets of \(4\text{,}156\text{,}426\) directed travels to work. The data has the following structure: the first and second columns contain the source and target settlement codes, and the third column gives the number of commuters. The format of this data is illustrated below:
source set. | target set. | num. of com. |
9719 | 9719 | 20,950 |
9703 | 9719 | 785 |
29719 | 29719 | 540 |
69719 | 69719 | 490 |
69719 | 69720 | 480 |
9711 | 9719 | 465 |
c. Population distribution data. The original dataset contains \(11\text{,}078\text{,}286\) square-like cells of 1 km^{2} area, each with its population and the latitude and longitude of its middle point. In order to speed up our calculations we have spatially renormalized this data and obtained a coarser resolution with 4 km^{2} size cells. This is done by collapsing the data of four neighbouring cells and averaging their latitudinal and longitudinal coordinates. As a result we ended up with \(1\text{,}230\text{,}920\) cells containing a total population \(W=308\text{,}745\text{,}231\). The data we have worked with has the following structure:
pop. num. | lat | lon |
18.0 | 51.8642065666 | −176.664361722 |
30.0 | 51.8621521667 | −176.6534376 |
9.0 | 51.8700427111 | −176.644826767 |
0.0 | 51.8704367889 | −176.633367733 |
7.0 | 51.8785901 | −176.629460933 |
219.0 | 51.8383112778 | −176.512803256 |
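The spatial renormalization described in item c can be sketched as follows. This is only an illustration under stated assumptions: the cells are assumed to carry integer grid indices `(row, col)` on a regular grid (the published format lists only population, latitude and longitude), and "collapsing" is taken to mean summing the populations of the merged cells. The function name `coarsen` is hypothetical.

```python
from collections import defaultdict


def coarsen(cells, block=2):
    """Merge block x block groups of neighbouring grid cells.

    cells: iterable of (row, col, population, lat, lon), where row and
           col are assumed integer grid indices (not part of the
           published data format).
    Returns a list of (population_sum, mean_lat, mean_lon) tuples,
    one per merged cell, ordered by coarse grid index.
    """
    groups = defaultdict(list)
    for row, col, pop, lat, lon in cells:
        groups[(row // block, col // block)].append((pop, lat, lon))

    merged = []
    for key in sorted(groups):
        members = groups[key]
        total_pop = sum(p for p, _, _ in members)          # populations add up
        mean_lat = sum(la for _, la, _ in members) / len(members)
        mean_lon = sum(lo for _, _, lo in members) / len(members)
        merged.append((total_pop, mean_lat, mean_lon))
    return merged


# Four 1 km^2 cells forming one 2 x 2 block (coordinates are made up).
example = [
    (0, 0, 18.0, 51.0, -176.0),
    (0, 1, 30.0, 51.0, -175.0),
    (1, 0, 9.0, 52.0, -176.0),
    (1, 1, 0.0, 52.0, -175.0),
]
coarse = coarsen(example)
```

Summing rather than averaging the population keeps the total population W unchanged by the coarse-graining.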
From the above three datasets one can compute the \(P_{>}(W)\) dependency. For the USA we used yet another dataset to verify the linear proportionality between the number of job openings and the total population of a geographical region. For this we obtained the number of listed jobs for each state of the continental USA from the site [23]. On the day we processed the data (12.02.2018) we found a total of \(2\text{,}596\text{,}391\) jobs. The population of the states was obtained using the estimate between 2006 and 2011, available on the Internet [22].
Apart from the large-scale data available for the USA, we used two smaller datasets, for Hungary and Italy. These two additional datasets contain the same three data subsets: settlement data, commuting data and population distribution data. The population distribution data was used in its original form, with cells of 1 km^{2}.
For Hungary we used the same commuting data as in [11]. The commuting data covers \(Q = 3176\) settlements and contains \(81\text{,}664\) commuter routes [24]; the spatial distribution of population covers the \(W = 9\text{,}972\text{,}000\) total inhabitants [25], as measured in the 2011 population census.
The data for Italy contains \(Q = 8093\) settlements and \(556\text{,}120\) commuter routes, and comes from the 2011 Italian population census [26]. The total population \(W = 55\text{,}605\text{,}065\) is mapped in cells of 1 km^{2} area [27].
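One simple way to check a linear proportionality between job openings and population is a least-squares fit of a line through the origin. The sketch below is an assumption about how such a check could be done, not the procedure used in the article; the function name and the toy numbers are invented for illustration and are not the state-level data.

```python
def proportionality_constant(populations, jobs):
    """Least-squares slope c of the model jobs = c * population.

    The line is forced through the origin, so c = sum(p*j) / sum(p*p).
    """
    numerator = sum(p * j for p, j in zip(populations, jobs))
    denominator = sum(p * p for p in populations)
    if denominator == 0:
        raise ValueError("at least one population must be non-zero")
    return numerator / denominator


# Toy values only: regions where jobs are exactly 1% of the population.
toy_populations = [1000, 2000, 4000]
toy_jobs = [10, 20, 40]
c = proportionality_constant(toy_populations, toy_jobs)
```

With real data one would also inspect the residuals, since a good fit through the origin is what supports using a single proportionality constant between population and job number.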
4 Data processing
During data processing, we select the settlements i one by one as the source of commuting and construct the disk of radius \(d(i,j)\) centered on i and reaching the target settlement j. This is illustrated schematically in Fig. 2. We count the total population \(w_{i}[j]\) inside this disk and record the number of commuters \(f_{i}(j)\) starting from settlement i and traveling to settlement j.
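The disk construction above can be sketched in Python as follows. The great-circle (haversine) distance, the Earth radius value and the choice to include cells lying exactly on the boundary are assumptions made here; the text does not specify them. The function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumed value


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def disk_population(source, target, cells):
    """Population w_i[j] inside the disk centered on `source` with radius d(i, j).

    source, target: (lat, lon) of settlements i and j.
    cells: iterable of (population, lat, lon) population cells.
    """
    radius = haversine_km(source[0], source[1], target[0], target[1])
    return sum(pop for pop, lat, lon in cells
               if haversine_km(source[0], source[1], lat, lon) <= radius)


# Made-up coordinates: only the first and third cells fall inside the disk.
w = disk_population((0.0, 0.0), (0.0, 1.0),
                    [(10.0, 0.0, 0.5), (5.0, 0.0, 2.0), (7.0, 0.0, 0.0)])
```

This brute-force version scans every cell for every settlement pair; for datasets of the size described here a spatial index or pre-sorted distances would be needed in practice.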
Having the data \(d(i,j)\), \(f_{i}(j)\), and \(w_{i}[j]\) for all the settlement pairs \((i,j)\) we compute the experimental \(P_{>}(W)\) probabilities.
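A possible way to obtain the experimental \(P_{>}(W)\) from these triplets is sketched below. It assumes that \(P_{>}(W)\) is the fraction of all commuters whose disk population \(w_{i}[j]\) exceeds W, aggregated over all settlement pairs; the strict inequality, the aggregation and the function name are assumptions made for this illustration.

```python
def survival_probability(records, thresholds):
    """Empirical P_>(W) for each W in `thresholds`.

    records: iterable of (w, f) pairs, where w is the population inside
             the disk for a settlement pair and f the number of commuters
             on that route.
    Returns the commuter-weighted fraction of routes with w > W.
    """
    records = list(records)
    total = sum(f for _, f in records)
    if total == 0:
        raise ValueError("no commuters in the records")
    return [sum(f for w, f in records if w > W) / total for W in thresholds]


# Toy records (not real data): 100 commuters on three routes.
toy_records = [(100, 10), (1000, 30), (10000, 60)]
p = survival_probability(toy_records, [0, 100, 1000, 10000])
```

By construction the result is non-increasing in W and drops to zero once W reaches the largest disk population in the data.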
We ordered the settlements according to their distance from i. Let \(h_{i}^{[k]}\) be the index of the settlement that is the kth one in this ordering (for example, \(h_{i}^{[1]}\) is the index of the settlement that is the closest to settlement i and \(h_{i}^{[2]}\) is the index of the settlement that is the second closest to i). We denote by \(s(i,w)\) the smallest number of settlements for which the population inside a disk centered on i becomes larger than or equal to w.
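The ordering \(h_{i}^{[k]}\) and the quantity \(s(i,w)\) can be sketched as below. The input layout is an assumption: distances from i are given as a dictionary, and the disk populations are pre-computed for the disks reaching the 1st, 2nd, ... closest settlement, so that they form a non-decreasing list. Function names are hypothetical.

```python
from bisect import bisect_left


def order_by_distance(distances):
    """Return the settlement indices sorted by distance from i.

    distances: dict mapping settlement index j -> d(i, j), excluding i.
    The k-th element (1-based) of the result plays the role of h_i^[k].
    """
    return sorted(distances, key=distances.get)


def smallest_count(disk_populations, w):
    """s(i, w): smallest k such that the k-th disk holds population >= w.

    disk_populations[k-1] is the population inside the disk centered on i
    that reaches the k-th closest settlement (non-decreasing in k).
    Returns None if no disk reaches population w.
    """
    index = bisect_left(disk_populations, w)  # first position with value >= w
    return index + 1 if index < len(disk_populations) else None


# Toy values only.
h = order_by_distance({7: 3.2, 2: 1.1, 5: 2.0})
s = smallest_count([50, 120, 120, 400], 120)
```

The binary search relies on the disk population growing monotonically with the radius, which holds because each larger disk contains the smaller ones.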
5 Results and discussions
Model | Fitted parameters (USA data) | \(\mathbf{R}^{2}\) |
---|---|---|
RM | μ = 0.0000308 | 0.971 |
RMwS | μ = 0.0000308, q = 1.0 | 0.971 |
TCORM | μ = 0.000119, λ = 0.0056 | 0.992 |
GM | α = 1.2, β = 1.2 | 0.993 |
FJM | μ = 0.000062 | 0.993 |
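For reference, the coefficient of determination reported for each model can be computed from paired observed and predicted values as in the sketch below. This is the standard definition; whether the article evaluates it on linear or logarithmic values of \(P_{>}(W)\) is not stated in this section, so that choice is left to the caller. The function name is hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination R^2 = 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1 - ss_res / ss_tot


# A perfect prediction gives 1; predicting the mean everywhere gives 0.
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
baseline = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```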
6 Conclusions
In order to describe the statistics of commuter fluxes at different distances, we introduced the FJM model, based on a mean-field type dynamical approach. The model indirectly takes into account that commuting to larger distances is costly and less probable. Relative to the classical models it offers an improved fit for commuter fluxes in the USA, Hungary and Italy. The probability that commuters travel to their jobs over a population W is compactly given by equation (16). The model has two parameters, although we have shown that one of them can be fixed so that all studied datasets are reasonably well explained. In this sense the model becomes, like the RM model, a one-parameter model, and improves on the RM model considerably.
To comment on the results obtained for the USA, Italy and Hungary, we review from Table 2 the best-fit parameter μ obtained with the FJM model. The parameter μ characterises both the availability of jobs per population and the attractiveness of these jobs to jobseekers. A higher value of μ suggests that there are many jobs relative to the population, and that jobseekers are aware of them and consider them for potential commuting. A smaller value of μ suggests that the number of available jobs per population is smaller and that jobseekers are very selective about commuting. The fitted values of μ are in good agreement with these heuristic arguments and confirm the known social and economic profiles of the USA, Italy and Hungary. Commuting is more common in the USA than in Europe, and there are more available commuting jobs per population. Some interesting conclusions can also be drawn from the value of the exponent a in equation (16). The difference from the original radiation model (where \(a=2\)) points to an already known issue: commuters are selective, not all available jobs are acceptable to them, and travel cost has to be taken into account when accepting a commuting job [7–11]. Because of this, the \(C/\eta\) value is smaller than the one for a simple salary optimization mechanism in which commuters accept the closest job that improves on their salary at home (the assumption of RM). This can be achieved by lowering the constant C, by increasing the value of η, or by changing both. The seemingly universal value \(a=7/4\) remains, however, a puzzle that motivates further studies.
In conclusion, we believe that the FJM model proposed in the present study rests on simple and reasonable assumptions and that the studied experimental data supports its predictions.
Declarations
Acknowledgements
Not applicable.
Availability of data and materials
The data used is available on the Internet via the links given in the “Data source and format” and “Data processing” sections. The processed data, in the format indicated in the “Data source and format” section, is available in the Figshare repository, doi:10.6084/m9.figshare.6151130, URL: https://figshare.com/s/b86965bb06ce018f52bf. This repository contains the figs_data.zip file with the data used for plotting the figures, and the Hungary.zip, Italy.zip and USA.zip files with the processed data for Hungary, Italy and the USA, respectively. The fig2.gdf file is a GUESS Graph Data Format file, which can be edited in a simple text editor.
Authors’ information
Z.N. is a professor of theoretical physics, working in the area of interdisciplinary applications of statistical physics. He uses both analytical and computational models to understand complex phenomena from physics, economics, biology and sociology. G.T. is a senior investigator at the Hungarian Central Statistical Office. He is a specialist in geographical data collection and in handling and processing large geographical datasets. L.V. is a PhD student in computational physics with a strong background in computer science and informatics.
Funding
Work supported by the Romanian Research Council UEFISCDI, Romania through grant Nr: PN-III-P4-PCE-2016-0363.
Authors’ contributions
ZN designed the study, elaborated the FJM model and wrote the first version of the manuscript. LV analyzed the data and drew the figures. GT collected the data, interpreted them and put them in the desired form. All authors worked on the final version of the manuscript. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
- Barbosa-Filho H, Barthélemy M, Ghoshal G, James RC, Lenormand M, Louail T, Menezes R, Ramasco JJ, Simini F, Tomasini M (2018) Human mobility: Models and Applications. Physics Reports. https://doi.org/10.1016/j.physrep.2018.01.001
- Stouffer SA (1940) Intervening opportunities: a theory relating mobility and distance. Am Sociol Rev 5:845–867
- Block H, Marschak J (1960) Random orderings and stochastic theories of responses. In: Contributions to probability and statistics, vol 2, pp 97–132
- Lukermann F, Porter PW (1960) Gravity and potential models in economic geography. Ann Assoc Am Geogr 50:493–504
- Simini F, González MC, Maritan A, Barabási AL (2012) A universal model for mobility and migration patterns. Nature 484:96–100
- Yan XY, Zhao C, Fan Y, Di Z, Wang WX (2014) Universal predictability of mobility patterns in cities. J R Soc Interface 11:20140834
- Yan XY, Wang WX, Gao ZY, Lai YC (2017) Universal model of individual and population mobility on diverse spatial scales. Nat Commun 8:1639
- Liu E, Yan XY New parameter-free mobility model. Preprint. arXiv:1808.06363
- Simini F, Maritan A, Néda Z (2013) Human mobility in a continuum approach. PLoS ONE 8(3):e60069
- Ren Y, Ercsey-Ravasz M, Wang P, Gonzales MC, Toroczkai Z (2014) Predicting commuter flows in spatial networks using a radiation model based on temporal ranges. Nat Commun 5:5347
- Varga L, Tóth G, Néda Z (2017) An improved radiation model and its applicability for understanding commuting patterns in Hungary. Reg Statist 6(2):27–38
- Stefanouli M, Polyzos S (2017) Gravity vs radiation model: two approaches on commuting in Greece. Transp Res Proc 24:65–72
- Wilson AG (1967) A statistical theory of spatial distribution models. Transp Res 1:253–269
- Hua CI, Porell F (1979) A critical review of the development of the gravity model. Int Reg Sci Rev 4(2):97–126
- Sheppard ES (1978) Theoretical underpinnings of the gravity hypothesis. Geogr Anal 10(4):386–402
- Niedercorn JH, Bechdolt BV (1969) An economic derivation of the “gravity law” of spatial interaction. J Regional Sci 9(2):273–282
- Domencich T, McFadden DL (2015) Urban travel demand: a behavioral analysis. North-Holland, Amsterdam
- Biró TS, Néda Z (2018) Unidirectional random growth with resetting. Physica A 499:355–361
- Thurner S, Kyriakopoulos F, Tallis C (2007) Unified model for network dynamics exhibiting nonextensive statistics. Phys Rev E 76:036111
- CTPP 2006–2010 Census Tract Flows, Commuting data, American Community Survey. https://www.fhwa.dot.gov/planning/census_issues/ctpp/data_products/2006-2010_tract_flows/
- Dash Nelson G, Rae A (2016) An economic geography of the United States: from commutes to megaregions. PLoS ONE 11(11):e0166083
- 2006–2010 Population distribution, American Community Survey. https://www.census.gov/geo/maps-data/data/tiger-data.html
- 2018 USA job openings accessed at 10.02.2018. https://www.indeed.com/
- 2011 Census Tract Flow, Commuting data, Hungary. http://www.ksh.hu
- 2011 Population distribution, Hungary. http://ec.europa.eu/eurostat/cache/GISCO/geodatafiles/GEOSTAT-grid-POP-1K-2011-V2-0-1.zip
- 2011 Census Tract Flow, Commuting data, Italy. http://www.istat.it/storage/cartografia/matrici_pendolarismo/matrici_pendolarismo_2011.zip
- 2011 Population distribution, Italy. http://ec.europa.eu/eurostat/cache/GISCO/geodatafiles/GEOSTAT-grid-POP-1K-2011-V2-0-1.zip