Abstract
Uncovering human mobility patterns is of fundamental importance to the understanding of epidemic spreading, urban transportation and other socioeconomic dynamics embodying spatiality and human travel. According to the direct travel diaries of volunteers, we show the absence of scaling properties in the displacement distribution at the individual level,while the aggregated displacement distribution follows a power law with an exponential cutoff. Given the constraint on total travelling cost, this aggregated scaling law can be analytically predicted by the mixture nature of human travel under the principle of maximum entropy. A direct corollary of such theory is that the displacement distribution of a single mode of transportation should follow an exponential law, which also gets supportive evidences in known data. We thus conclude that the travelling cost shapes the displacement distribution at the aggregated level.
Introduction
Positioning systems in mobile phones and vehicles and WiFi devices in laptop computers and personal digital assistants have made quantitative analyses of human mobility patterns possible^{1,2,3,4}. These analyses have a significant potential to reveal novel statistical regularities of human behavior, refine our understanding of the socioeconomic dynamics embodying spatiality and human mobility^{5,6} and eventually contribute to controlling disease^{7,8,9,10}, designing transportation systems^{11}, locating facilities^{12}, providing locationbased services^{13,14,15} and so on.
Aggregated data from bank notes^{16}, mobile phones^{1} and onboard GPS measurements^{3} showed that the displacement distribution of human mobility, for both longrange travel and daily movements, approximately follows a power law. The scaling laws in longrange travel may result from the hierarchical organization of transportation systems^{17}, while the scaling laws in daily movements have recently been explained by the exploration and preferential return mechanism^{18}.
Thus far, we still lack solid results about human mobility patterns at the individual level. Inferring individual features from the aggregated data is very risky because the scaling law for the population could be a mixture of many individuals with different statistics^{19}. In addition, the aforementioned data are not sufficient to draw conclusions at the individual level. First, data such as GPS records from taxis and the trajectories of bank notes consist of many individual movements, but these individuals are not easy to be distinguished from each other. Second, data such as GPS records from mobile phones and the trajectories of bank notes could not accurately capture purposeful travels with explicit origins and destinations. In fact, the displacement between two activations of a mobile phone may be just a tiny portion of a purposeful trip or a combination of several sequential trips, while the displacement between two registrations of a bank note could be the result of a number of sequential trips made by different people.
Instead of using proxy data, we analyze the travel diaries of hundreds of volunteers. Though the data set is small, it contains personal profiles and explicit positions of origins and destinations, allowing quantitative and authentic analyses at the individual level. In contrast to the scaling laws in aggregated data, individuals show diverse mobility patterns and few of them display the scaling property. In fact, the trajectories of students and employees are dominated by trips connecting homes with schools and workplaces, respectively, while trips are distributed more homogeneously among different locations for others such as retirees, homemakers and unemployed people. The aggregated displacement distribution follows a power law with an exponential cutoff, which can be analytically explained by the mixed nature of human travel under the principle of maximum entropy. In addition, this theory predicts that the displacements using a single mode of transportation will follow an exponential distribution, which is also supported by the empirical data on taxi trips, car trips, bus trips and air flights.
Results
Individual mobility patterns
Our analysis of human mobility is based on a data set of 230 volunteers' sixweek travel diaries in Frauenfeld, Switzerland^{20}. This data set contains the volunteers' personal information, including age, job and sex and 36761 trip records. By calculating the spherical distance between the origin and destination from their longitudes and latitudes, we can obtain the length of each trip (see details about data in Methods).
We first measure the individual displacement distributions from the data set. Figs. 1(a)–1(c) show three typical individuals' displacement distributions (Table S1 presents all volunteers' displacement distributions), from which we cannot find any universal scaling properties. Indeed, when we use the KolmogorovSmirnov test^{21} to test whether the distributions fit power laws, we find that 87.8% of the individuals cannot pass the test (statistical validation results are listed in Table S2 and the details about KolmogorovSmirnov test are shown in Methods). This result strongly suggests the absence of scaling laws in human travel at the individual level.
To reveal the underlying structure of individual trips, we assign to each individual a mobility network, in which nodes denote locations visited by individuals, edges represent the trips between nodes and edge weight is defined as the number of corresponding trips^{22}. Figs. 1(d)–1(f) show three typical individuals' mobility networks (all networks are presented in Table S1). As shown in Fig. 1 and Table S1, for most students and employees, their edge weights are highly heterogeneous. For each individual, we call the trip corresponding to the edge with the largest weight the dominant trip and define the domination ratio d as the ratio of the weight of the dominant trip to the total weight. Fig. 2 reports the distribution of domination ratios for different groups of individuals, from which we can see that the student group has the largest d on average and the employees' average domination ratio is smaller than that of the students but larger than that of the other group.
The difference of d results from the fact that students and employees frequently travel between homes and schools/workplaces in working days but retirees or homemakers do not have to do so. The peak values in the displacement distributions of students and employees are thus usually determined by the lengths of their dominant trips. Because the lengths of dominant trips are not necessarily small, the displacement distribution for an individual is usually not rightskewed and is far different from a power law. In addition, the significant role of the dominant trip indicates that an individual's traveling process in general cannot be characterized by the Lévy flight^{16} or truncated Lévy flight^{1}.
Scaling property in aggregated data
The aggregated displacement distribution of individuals (see Fig. 3) is well approximated by a power law with an exponential cutoff P(r) ∝ r^{−1.05} exp(−r/50) (the fitness significance pvalue by the KolmogorovSmirnov test^{21} is 1.000 and the standard KolmogorovSmirnov distance D is 0.039, see Methods and Fig. S1 for details), which is similar to those observed for bank notes^{16} and mobile phone users^{1}. As shown above, this scaling property is not a simple combination of many analogous individuals. We assume that the total travel cost is C, the number of trips with cost c_{i} is n_{i}. According to the maximum entropy principle^{23}, the two constraints, Σn_{i} = N and Σn_{i}c_{i} = C, lead to the solution , where is the average travel cost. Denote the density of trips with cost c by P(c), then .
The travel cost is commonly approximated as the weighted sum c ≈ ηt + μm, where η and μ are two coefficients and t and m are the costs involving time and money, respectively. Previous empirical studies have suggested that the monetary cost is approximately proportional to the travel distance as m ≈ νr^{24}, while the travel time approximately obeys a hybrid form t ≈ ϕ ln r + ωr + ψ^{25,26}, where ν, ϕ, ω and ψ are coefficients. The logarithmic term results from the mixture of modes of transportation^{27}. Apparently, people move faster when traveling longer distances: we walk from classroom to office but take an airplane from US to China. Figure S2 reports the statistics related to travel times of the data set used in this paper. Although the data set is not large enough and contains some noisy points, overall speaking, the travel time t grows in a hybrid form as mentioned above, with ϕ ≈ 9 and ω ≈ 0.4.
Integrating the aforementioned terms, we obtain the displacement distribution P(r) ∝ (β/r + 1/κ)r^{−β} exp(−r/κ), where and . When κ is large, the distribution is approximated as a powerlaw with an exponential cutoff. Indeed, for the real data, κ = 40 and β = 0.38, so for r < 100, that is, the term can be neglected. As shown in Fig. S3, the corresponding fitting line is very close to a power law with an exponential cutoff (but with a slightly higher powerlaw exponent 1.38).
A direct corollary of maximum entropy principle is that the displacement distribution should follow an exponential form if it only accounts for trips from a single mode of transportation because in that case, c ∝ r. This corollary gets supportive evidences from a number of empirical studies on disparate systems^{28,29,30,31,32,33} (Bazzani et al.^{28} observed a slight deviation from the exponential law). Fig. 4 reports empirical cumulative distributions for taxi trajectories in Beijing^{31}, car trips in Detroit (downloaded from www.semcog.org), bus trips in Shijiazhuang (collected by the authors) and air flights in the US^{30}. The probability density distributions are shown in Fig. S4. All distributions can be well characterized by exponentiallike functions.
Discussion
The general lessons that we learned from the present analysis could be used to refine our knowledge of human mobility patterns. The displacement distributions for aggregated data usually display powerlaw decay with an exponential cutoff. Meanwhile, there are examples ranging from taxi trips to air flights in which the displacement distributions are exponential. In these examples, every displacement distribution is generated by trips involving a single mode of transportation, which corresponds to a linear relation between the travel cost and distance and eventually results in an exponential displacement distribution according to the principle of maximum entropy. In a word, we believe the travel cost is one main reason resulting in the regularities in aggregated statistics. The present results suggest that the form (power law or exponential or other) of deterrence function in the gravity law for human travel^{34} may be sensitive to the modes of transportation under consideration.
This study warns researchers of the risk of inferring individual behavioral patterns directly from aggregated statistics. Analogously, the temporal burstiness of human activities is widely observed and the researchers are aware of the fact that the aggregated scaling laws could either be a combination of a number of individuals, each of whom displays scaling laws similar to the population^{35}, or the result of a mixture of diverse individuals, most of whom exhibit far different statistical patterns than the population^{36,37,38}. In comparison, such issues are less investigated for spatial burstiness. In particular, experimental analyses on individuals has rarely been reported. Determining whether the displacement distribution of an individual follows a powerlaw distribution will require further data and analysis.
It is already known to the scientific community that a number of Poissonian agents with different acting rates can make up a powerlaw interevent time distribution at the aggregated level^{36,37,38} and very recently, Proekt et al.^{39} showed that the aggregated scaling laws on interevent time distribution may be resulted from different time scales. Petrovskill et al.^{19} have applied similar (yet different) idea in explaining the aggregated scaling laws in walking behavior. Although being mathematically and technically different, this work embodies some similar perspectives, because the different transportation modes indeed assign different scales onto space: the world becomes smaller by air flights while a city is really big by walking. Elegant analogy between temporal and spatial human behaviors will benefit the studies of each other.
Many known mechanisms underlie the scaling laws of complex systems^{40,41,42}, including rich get richer^{43,44,45}, good get richer^{46,47}, merging and regeneration^{48}, optimization^{49,50}, Hamiltonian dynamics^{51}, stability constraints^{52} and so on. The individual mobility model by Song et al.^{18} is a typical example embodying the rich get richer mechanism. We have implemented such model. As shown in Fig. S5, the exploration and preferential return model can well reproduce the diversity of individual mobility patterns. In addition, for this model, the Gibbs entropy of the displacement distribution at the individual level increases continuously due to the increasing number of locations as well as links connecting location pairs. However, the exploration and preferential return model does not explain why the lengths of exploration trips should follow a power law, which is a core assumption leading to the powerlawlike aggregated displacement distribution. Therefore, our work has complemented Ref. 18 and other related works in two aspects: (i) providing supportive empirical observation at the individual level; (ii) providing alternative explanation on the emergence of scaling in aggregated displacement distribution. Very recently, from the analysis on mobility patterns in an online game, Szell et al.^{53} observed a characteristic jump length and guessed that the existence of the characteristic length may be due to the single mode of transportation. The present theory could explain their observation since a jump in such online game costs time that is proportional to the jump length.
Methods
Data description
This work was performed using a travel survey data set that contains 230 volunteers' sixweek travel diaries in Frauenfeld, Switzerland^{20}. The survey was conducted among 230 volunteers from 99 households in Frauenfeld and the surrounding areas in Canton Thurgau from August to December 2003. The volunteers reported their daily travel by filling out (paper and pencil based) selfadministrated questionnaire day by day in a sixweek period. Each reported trip includes the information of origin, destination and purpose. The origin and destination of a trip were geocoded by longitude and latitude. The quality of the geocoding is very high  with 60% of trips captured within 100 m of their true origins and destinations and 90% within 500 m. The purpose of trip was classified into work, shopping, education, home, leisure, business and other. The data has been crosschecked to ensure the consistency and filtered to remove outliers as well as unclear and omitted destination addresses. The final cleaned data set includes 36761 trip records. Besides, the data set also contains sociodemographic characteristics of the volunteers' personal information such as age, job and sex.
Kolmogorovsmirnov (KS) test
Given an observed distribution P(x), we firstly assume that it obeys a certain form , with a set of parameters a_{1}, , whose values are estimated by using the maximum likelihood method^{21}. The standard KS distance is defined as the maximal distance between the cumulative density functions of the observed data P^{c}(x) and the fitting curve F^{c}(x), namely . We independently sample a set of data points according to F^{c}(x), such that the number of sampled data points is the same as the number of observed data points and then calculate the maximal distance (denoted by ) between F^{c}(x) and the cumulative density function of the sampled data points. The pvalue is defined as the probability that . In this paper, we always implement 1000 independent runs to estimate the pvalue.
Logarithmic binning
The statistical nature of sampling will lead to the increasing noise in the tails of empirical powerlawtype distributions. Applying the procedure of logarithmic binning^{54} can smooth the noisy tail. Logarithmic binning is a procedure of averaging the data that fall in the specific bins whose size increases exponentially. For each bin the observed value are normalized by dividing by the bin width and the total number of observations (see Fig. 3).
References
González, M. C., Hidalgo, C. A. & Barabási, A. L. Understanding individual human mobility patterns. Nature 453, 779–782 (2008).
Lu, X., Bengtsson, L. & Holme, P. Predictability of population displacement after the 2010 Haiti earthquake. Proc. Natl. Acad. Sci. U.S.A. 109, 11576–11581 (2012).
Jiang, B., Yin, J. & Zhao, S. Characterizing the human mobility pattern in a large street network. Phys. Rev. E 80, 021136 (2009).
Yoon, J., Noble, B. D., Liu, M. & Kim, M. Building realistic mobility models from coarsegrained traces. in Proc. of the ACM MobiSys'06 (Uppsala, Sweden), pp 177–190 (2006).
Vespignani, A. Predicting the behavior of technosocial systems. Science 325, 425–428 (2009).
Barthélemy, M. Spatial networks. Phys. Rep. 499, 1–101 (2011).
Balcan, D. & Vespignani, A. Phase transitions in contagion processes mediated by recurrent mobility patterns. Nat. Phys. 7, 581–586 (2011).
Belik, V., Geisel, T. & Brockmann, D. Natural human mobility patterns and spatial spread of infectious diseases. Phys. Rev. X 1, 011001 (2011).
Ni, S. & Weng, W. Impact of travel patterns on epidemic dynamics in heterogeneous spatial metapopulation networks. Phys. Rev. E 79, 016111 (2009).
Zhao, Z.D., Liu, Y. & Tang, M. Epidemic variability in hierarchical geographical networks with human activity patterns. Chaos 22, 023150 (2012).
Horner, M. W. & O'Kelly, M. E. S. Embedding economies of scale concepts for hub networks design. J. Transp. Geogr. 9, 255–265 (2001).
Um, J., Son, S.W., Lee, S.I., Jeong, W. & Kim, B. J. Scaling laws between population and facility densities. Proc. Natl. Acad. Sci. U.S.A. 106, 14236–14240 (2009).
Zheng, V. M., Zheng, Y., Xie, X. & Yang, Q. Collaborative location and activity recommendations with GPS history data. in Proceedings of the 19th International Conference on World Wide Web (New York, ACM Press), pp 1029–1038 (2010).
Clements, M., Serdyukov, P., de Vries, A. P. & Reinders, M. J. T. Personalised travel recommendation based on location cooccurrence. arXiv:1106.5213.
Scellato, S., Noulas, A. & Mascolo, C. Exploiting place features in link prediction on locationbased social networks. in Proc. of the ACM KDD'11 (New York, ACM Press), pp 1046–1054 (2011).
Brockmann, D., Hufnagel, L. & Geisel, T. The scaling laws of human travel. Nature 439, 462–465 (2006).
Han, X.P., Hao, Q., Wang, B.H. & Zhou, T. Origin of the scaling law in human mobility: hierarchy of traffic systems. Phys. Rev. E 83, 036117 (2011).
Song, C., Koren, T., Wang, P. & Barabási, A. L. Modelling the scaling properties of human mobility. Nat. Phys. 6, 818–823 (2010).
Petrovskii, S., Mashanova, A. & Jansen, V. A. A. Variation in individual walking behavior creates the impression of a Lévy flight. Proc. Natl. Acad. Sci. U.S.A. 108, 8704–8707 (2011).
Chalasani, V. S., Engebretsen, Ø. Denstadli, J. M. & Axhausen, K. W. Precision of geocoded locations and network distance estimates. J. Transport. Stat. 8, 1–15 (2005).
Clauset, A., Shalizi, C. R. & Newman, M. E. J. Powerlaw distributions in empirical data. SIAM Rev. 51, 661–703 (2009).
Song, C., Qu, Z., Blumm, N. & Barabási, A. L. Limits of predictability in human mobility. Science 327, 1018–1021 (2010).
Balescu, R. Equilibrium and Nonequilibrium Statistical Mechanics (New York: John Wiley) (1975).
Willumsen, L. G. Travel networks. in Handbook of Transport Modelling, eds Hensher D. A. & Button K. J. (New York: Pergamon), pp 165–180 (2000).
Rietveld, P., Zwart, B., van Wee, B. & van den Hoorn, T. On the relationship between travel time and travel distance of commuters. Ann. Reg. Sci. 33, 269–287 (1999).
Li, S., Wang, H. & Wang, Z. A study on tour time planning of domestic sightseeing travel itineraries. Hum. Geogr. 20, 51–56 (2005).
Oosterhaven, J. A. & Rietveld, P. Transport costs, location and the economy. in Location and Competition, eds Brakman S. & Garretsen H. (New York: Routledge), pp 32–60 (2005).
Bazzani, A., Giorgini, B., Rambaldi, S., Gallotti, R. & Giovannini, L. Statistical Laws in Urban Mobility from microscopic GPS data in the area of Florence. J. Stat. Mech. P05001 (2010).
Roth, C., Kang, S. M., Batty, M. & Barthélemy, M. Structure of urban movements: polycentric activity and entangled hierarchical flows. PLoS ONE 6, e15923 (2011).
Jiang, B. & Jia, T. Exploring human mobility patterns based on location information of US flights. arXiv:1104.4578v2.
Liang, X., Zheng, X., Lü, W., Zhu, T. & Xu, K. The scaling of human mobility by taxis is exponential. Physica A 391, 2135–2144 (2012).
Gallotti, R., Bazzani, A. & Rambaldi, S. Towards a Statistical physics of human mobility. Int. J. Mod. Phys. C 23, 1250061 (2012).
Peng, C., Jin, X., Wong, K. C., Shi, M. & Liò, P. Collective human mobility pattern from taxi trips in urban area. PLoS ONE 7, e34487 (2012).
Simini, F., González, M. C., Maritan, A. & Barabási, A.L. A universal model for mobility and migration patterns. Nature 484, 96–100 (2012).
Barabási, A. L. The origin of bursts and heavy tails in human dynamics. Nature 435, 207–211 (2005).
Malmgrena, R. D., Stouffera, D. B., Motterb, A. E. & Amarala, L. A. N. A Poissonian explanation for heavy tails in email communication. Proc. Natl. Acad. Sci. U.S.A. 105, 18153–18158 (2008).
Hidalgo, C. A. Conditions for the emergence of scaling in the interevent time of uncorrelated and seasonal systems. Physica A 369, 877–883 (2006).
Wu, Y., Zhou, C., Xiao, J., Kurths, J. & Schellnhuber, H. J. Evidence for a bimodal distribution in human communication. Proc. Natl. Acad. Sci. U.S.A. 107, 18803–18808 (2010).
Proekt, A., Banavar, J. R., Maritan, A. & Pfaff, D. W. Scale invariance in the dynamics of spontaneous behavior. Proc. Natl. Acad. Sci. U.S.A. 109, 10564–10569 (2012).
Mitzenmacher, M. A brief history of generative models for power law and lognormal distributions. Internet Math. 1, 226–251 (2004).
Newman, M. E. J. Power laws, Pareto distributions and Zipf's law. Contemp. Phys. 46, 323–351 (2005).
Simkin, M. V. & Roychowdhury, V. P. Reinventing Willis. Phys. Rep. 502, 1–35 (2011).
Simon, H. A. On a class of skew distribution functions. Biometrika 42, 425–440 (1955).
Barabási, A. L. & Albert, R. Emergence of scaling in random networks. Science 286, 509–512 (1999).
Lü, L., Zhang, Z.K. & Zhou, T. Deviation of Zipf's and Heaps' Laws in Human Languages with Limited Dictionary Sizes. Sci. Rep. 3, 1082 (2013).
Garlaschelli, D., Capocci, A. & Caldarelli, G. Selforganized network evolution coupled to extremal dynamics. Nat. Phys. 3, 813–817 (2007).
Zhou, T., Medo, M., Cimini, G., Zhang, Z.K. & Zhang, Y.C. Emergence of ScaleFree Leadership Structure in Social Recommender Systems. PLoS ONE 6, e20648 (2011).
Kim, B. J., Trusina, A., Minnhagen, P. & Sneppen, K. Self organized scalefree networks from merging and regeneration. Eur. Phys. J. B 43, 369–372 (2005).
Valverde, S., Cancho, F. & Solé, R. V. Scalefree networks from optimal design. Eur. Phys. Lett. 43, 369–372 (2002).
Bartumeus, F., Da Luz, M. G. E., Viswanathan, G. M. & Catalan, J. Animal search strategies: a quantitative randomwalk analysis. Ecology 86, 3078–3087 (2005).
Baiesi, M. & Manna, S. Scalefree networks from a Hamiltonian dynamics. Phys. Rev. E 68, 047103 (2003).
Perotti, J. I., Billoni, O. V., Tamarit, F. A., Chialvo, D. R. & Cannas, S. A. Emergent selforganized complex network topology out of stability constraints. Phys. Rev. Lett. 103, 108701 (2009).
Szell, M., Sinatra, R., Petri, G., Thurner, S. & Latora, V. Understanding mobility in a social petri dish. Sci. Rep. 2, 457 (2012).
Milojević, S. Power law distributions in information science: Making the case for logarithmic binning. J. Am. Soc. Inf. Sci. Tec. 61, 2417–2425 (2010).
Acknowledgements
This work was partially supported by the National Natural Science Foundation of China (NNSFC) under Grant Nos. 11222543, 11205040, 11275186 and 91024026, Program for New Century Excellent Talents in University under Grant No. NCET110070 and Huawei UniversityEnterprise Cooperation Project under Grant No. YBCB2011057.
Author information
Authors and Affiliations
Contributions
X.Y.Y., X.P.H., B.H.W. and T.Z. designed the research. X.Y.Y. performed the experiments, X.Y.Y., X.P.H. and T.Z. analysed the data, X.Y.Y. and T.Z. wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Electronic supplementary material
Supplementary Information
Supporting Information
Rights and permissions
This work is licensed under a Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/
About this article
Cite this article
Yan, XY., Han, XP., Wang, BH. et al. Diversity of individual mobility patterns and emergence of aggregated scaling laws. Sci Rep 3, 2678 (2013). https://doi.org/10.1038/srep02678
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/srep02678
Further reading

What are the multimodal patterns of individual mobility at the day level in the Paris region? A twostage datadriven approach based on the 2018 Household Travel Survey
Transportation (2022)

A universal opportunity model for human mobility
Scientific Reports (2020)

Estimation of Regional Economic Development Indicator from Transportation Network Analytics
Scientific Reports (2020)

Destination choice game: A spatial interaction theory on human mobility
Scientific Reports (2019)

Universal knowledge discovery from big data using combined dualcycle
International Journal of Machine Learning and Cybernetics (2018)
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.