Destination choice game: A spatial interaction theory on human mobility

With remarkable significance for migration prediction, global disease mitigation, urban planning and many other applications, predicting human mobility fluxes between any two locations is a pressing challenge. A number of models have been proposed to address this challenge, including the gravity model, the intervening opportunity model, the radiation model and the population-weighted opportunity model. Despite their theoretical elegance, all these models ignore an intuitive and important ingredient in an individual's decision about where to go, namely the possible congestion on the way and the possible crowding at the destination. Here we propose a microscopic mechanism underlying mobility decisions, named the destination choice game (DCG), which takes into account the crowding effects resulting from spatial interactions among individuals. In comparison with state-of-the-art models, the DCG predicts mobility fluxes more accurately across a wide range of scales, from intracity trips to intercity travels and further to internal migrations. The well-known gravity model is proved to be the equilibrium solution of a degenerated DCG that neglects the crowding effects in the destinations.


S1 Additional validation of the DCG model
We use two types of data, namely intercity travels and intracity trips, to validate the DCG model. These data sets are described below.
(1) Intercity travels. The data for intercity travels in Japan, the U.K. and Belgium are extracted from the Gowalla check-in data set [S1] (https://snap.stanford.edu/data/locgowalla.html). Gowalla is a location-based social networking website on which users share their locations by checking in. The data set includes 6,442,890 check-ins of users over the period Feb. 2009-Oct. 2010. For this data set, we define a user's travel as two consecutive check-ins in different cities.
(2) Intracity trips. The records of intracity trips in New York and Los Angeles are extracted from the Foursquare check-in data set [S2], which contains 73,171 users. We define a user's trip as two consecutive check-ins at different locations (here, the locations are defined as the 2010 census blocks; see https://www.census.gov/geo/mapsdata/maps/block/2010/). The total number of trips is 182,033. The data for intracity trips in Oslo, Norway is extracted from the Gowalla check-in data set [S1]. Because of the absence of census blocks and traffic analysis zones in Oslo, we simply partition the city into 88 equal-area square zones, each of which is about 1 km × 1 km. Each zone is one location in the city.
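The trip definition used for both data types (two consecutive check-ins by the same user at different locations) can be sketched in a few lines of Python. This is only an illustration: the record layout `(user, timestamp, location)`, the function name `extract_trips` and the toy records are assumptions, not the format of the Gowalla or Foursquare files.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter


def extract_trips(checkins):
    """Count trips between locations from (user, timestamp, location) records.

    A trip is two consecutive check-ins by the same user at different locations.
    """
    flux = Counter()
    # Sort by user, then by time, so each user's check-ins are in visiting order.
    ordered = sorted(checkins, key=itemgetter(0, 1))
    for _, records in groupby(ordered, key=itemgetter(0)):
        records = list(records)
        for (_, _, origin), (_, _, dest) in zip(records, records[1:]):
            if origin != dest:  # repeated check-ins at one location are not trips
                flux[(origin, dest)] += 1
    return flux


# Toy records (hypothetical, not taken from any real data set).
checkins = [
    ("u1", 3, "B"), ("u1", 1, "A"), ("u1", 2, "A"),
    ("u2", 1, "B"), ("u2", 2, "A"), ("u2", 3, "B"),
]
flux = extract_trips(checkins)
print(dict(flux))
```

Here a location can be a city, a census block or a grid zone, depending on the data set; mapping coordinates to such a location label is assumed to have been done beforehand.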
The estimated model parameters for these data sets are shown in table S1, and the prediction results are shown in Figs S1-S4. As with the results in the main text, the DCG model predicts the real fluxes well, with higher accuracy than the other benchmark models in terms of the SSI.
We further compare the predictions of the DCG model with those of the other benchmark models in terms of the travel distance distribution P(d) and the destination attraction distribution P(D).
The prediction results are shown in Figs S5-S6. To compare the prediction accuracy of the different models quantitatively, we perform the two-sample Kolmogorov-Smirnov (KS) test [S3] on the model-predicted and observed P(d) and P(D). The results are shown in tables S2-S3, from which we can see that the KS statistics of the DCG model are generally smaller than or close to those of the gravity model, meaning that the DCG model has relatively high prediction accuracy.
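For readers unfamiliar with the KS statistic, the following minimal sketch computes the two-sample version as the largest gap between two empirical cumulative distribution functions. It works on raw samples and uses only the standard library; the function name `ks_statistic` and the toy samples are assumptions for illustration and are not the procedure or data behind tables S2-S3.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all observed x."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        # bisect_right counts the points <= x, i.e. the empirical CDF at x.
        f_a = bisect_right(a, x) / len(a)
        f_b = bisect_right(b, x) / len(b)
        d = max(d, abs(f_a - f_b))
    return d


# Toy samples (hypothetical): a smaller statistic means a closer match.
observed = [1, 2, 3, 4]
predicted_close = [1, 2, 3, 5]
predicted_far = [5, 6, 7, 8]
print(ks_statistic(observed, predicted_close), ks_statistic(observed, predicted_far))
```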

S2 Derivation of the gravity model using potential game theory
Potential game theory originated from the congestion game presented by Rosenthal [S4].
Monderer and Shapley defined exact potential games [S5] in which information concerning the Nash equilibrium can be incorporated in a potential function. They showed that every exact potential game is isomorphic to a congestion game. In the congestion game model, each player chooses a subset of resources. The benefit associated with each resource is a function of the number of players choosing it. The payoff to a player is the sum of the benefits associated with each resource in his strategy choice. A Nash equilibrium is a selection of strategies for all players such that no players can increase their payoffs by changing their strategies individually. Strategy profiles maximizing the potential function are the Nash equilibria [S6].
From the introduction of the congestion game we can see that the degenerated destination choice game (DDCG) neglecting the crowding effect in the destination is a typical congestion game. Below we will give the process for finding the Nash equilibrium solution of the DDCG by maximizing the potential function of the congestion game.
A congestion game is a tuple $(N, R, (\Psi_k)_{k \in N}, (w_j)_{j \in R})$ [S7], where $N = \{1, \ldots, n\}$ is a set of players (for DDCG, it is the set of $O_i$ travellers starting from origin $i$), $R = \{1, \ldots, m\}$ is a set of resources (for DDCG, it is the set of destinations), $\Psi_k \subseteq 2^R$ is the strategy space of player $k$ (for DDCG, each player can only choose one destination in a strategy), and $w_j$ is a benefit function associated with resource $j$ (for DDCG, it is the utility function, say $w_j = U_{ij}$). Notice that benefit functions can take negative values, representing costs of using resources [S6]. $S = (S_1, \ldots, S_n)$ is a state of the game in which player $k$ chooses strategy $S_k \in \Psi_k$. For a state $S$, the congestion $n_j(S)$ on resource $j$ is the number of players choosing $j$. For DDCG, the number of travellers choosing destination $j$ is $T_{ij}(S)$. The congestion game is an exact potential game [S5], in which the potential function is defined as
$$\phi(S) = \sum_{j \in R} \sum_{k=1}^{n_j(S)} w_j(k). \tag{S1}$$
For the DDCG with utility function $U_{ij} = \alpha \ln A_j - \beta \ln d_{ij} - \ln T_{ij}$, the potential function is
$$\phi(S) = \sum_{j} \sum_{k=1}^{T_{ij}(S)} \left( \alpha \ln A_j - \beta \ln d_{ij} - \ln k \right), \tag{S2}$$
where $A_j$ is the attractiveness of location $j$, $d_{ij}$ is the geometric distance between $i$ and $j$, and $\alpha$ and $\beta$ are nonnegative parameters. To find the Nash equilibrium solution of DDCG, we treat $T_{ij}$ as a continuous variable. Then, Eq. (S2) can be rewritten as
$$\phi(S) = \sum_{j} \int_0^{T_{ij}} \left( \alpha \ln A_j - \beta \ln d_{ij} - \ln x \right) dx. \tag{S3}$$
For the optimization problem in which $\max \phi(S)$ is subject to $\sum_j T_{ij} = O_i$, we can use the Lagrange multiplier method to obtain the solution. The Lagrangian expression is
$$L = \phi(S) + \lambda \Big( \sum_j T_{ij} - O_i \Big), \tag{S4}$$
where $\lambda$ is a Lagrange multiplier. Setting the partial derivative of the Lagrangian expression with respect to $T_{ij}$ to zero gives
$$\frac{\partial L}{\partial T_{ij}} = \alpha \ln A_j - \beta \ln d_{ij} - \ln T_{ij} + \lambda = 0. \tag{S5}$$
The other partial derivative gives
$$\frac{\partial L}{\partial \lambda} = \sum_j T_{ij} - O_i = 0. \tag{S6}$$
By combining Eq. (S5) and Eq. (S6) we can derive
$$T_{ij} = O_i \frac{A_j^{\alpha} d_{ij}^{-\beta}}{\sum_{j'} A_{j'}^{\alpha} d_{ij'}^{-\beta}}, \tag{S7}$$
which happens to be an origin-constrained gravity model with two free parameters. If we set $\alpha = 1$, the solution becomes
$$T_{ij} = O_i \frac{A_j d_{ij}^{-\beta}}{\sum_{j'} A_{j'} d_{ij'}^{-\beta}}, \tag{S8}$$
which is the standard origin-constrained gravity model [S8]. Now back to the DCG model that considers both the congestion on the way and the crowding in the destination.
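The equilibrium property behind this derivation can be checked numerically. The sketch below assumes the origin-constrained form $T_{ij} = O_i A_j^{\alpha} d_{ij}^{-\beta} / \sum_{j'} A_{j'}^{\alpha} d_{ij'}^{-\beta}$ and the DDCG utility $U_{ij} = \alpha \ln A_j - \beta \ln d_{ij} - \ln T_{ij}$ stated in the text; the function names and all input values are hypothetical. At the solution every destination yields the same utility, so no traveller gains by switching.

```python
import math


def origin_constrained_gravity(O_i, A, d, alpha=1.0, beta=1.0):
    """Fluxes T_ij = O_i * A_j^alpha * d_ij^-beta / sum_j'(A_j'^alpha * d_ij'^-beta)."""
    weights = [a ** alpha * dist ** (-beta) for a, dist in zip(A, d)]
    total = sum(weights)
    return [O_i * w / total for w in weights]


def utilities(T, A, d, alpha=1.0, beta=1.0):
    """DDCG utility U_ij = alpha*ln(A_j) - beta*ln(d_ij) - ln(T_ij)."""
    return [alpha * math.log(a) - beta * math.log(dist) - math.log(t)
            for t, a, dist in zip(T, A, d)]


# Hypothetical inputs: 1000 travellers at origin i, three destinations.
A = [500.0, 2000.0, 800.0]  # attractiveness A_j
d = [2.0, 10.0, 5.0]        # distances d_ij from origin i
T = origin_constrained_gravity(1000.0, A, d, alpha=0.9, beta=1.5)
U = utilities(T, A, d, alpha=0.9, beta=1.5)
print(T)
print(U)  # all entries coincide up to rounding
```

Treating $T_{ij}$ as continuous, as in the text, is what makes the utilities exactly equal; with integer fluxes they would only be approximately equal.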
Its utility function is [...] If the crowding cost $\ln D_j$ is not affected by the fluxes $T_{ij}$, the maximization of the [...] However, in fact, the destination attraction $D_j = \sum_i T_{ij}$ depends on the fluxes $T_{ij}$, which is the essential difficulty in solving for the Nash equilibrium of the DCG. Therefore, we use the iterative algorithm MSA (see Material and Methods in the main text) to solve the DCG model numerically. In the MSA iteration, the function to calculate the [...] where $\lambda_i$ and $\mu_j$ are Lagrange multipliers. The partial derivative of the Lagrangian is therefore [...] Since [...] and [...] we can get [...] Let $a_i = e^{\lambda_i}/O_i$ and $b_j = e^{\mu_j}/D_j$; Eq. (S15) can then be rewritten as [...] which is the standard doubly-constrained gravity model [S8]. In the actual calculation, $a_i$ and $b_j$ are two sets of interdependent balancing factors, i.e. [...]
This means that the calculation of one set requires the values of the other set: start with all $b_j = 1$, solve for $a_i$, use these values to re-estimate $b_j$, and repeat until convergence of the two sets is achieved [S8].
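The alternating update of the two sets of balancing factors can be sketched as follows. The sketch assumes the textbook doubly-constrained form $T_{ij} = a_i O_i b_j D_j f_{ij}$ with $a_i = 1/\sum_j b_j D_j f_{ij}$ and $b_j = 1/\sum_i a_i O_i f_{ij}$, where $f_{ij}$ is a generic deterrence term (a power function of distance in the example). The function name, the stopping rule and the numbers are assumptions, and the totals of $O$ and $D$ must match.

```python
def doubly_constrained(O, D, f, tol=1e-9, max_iter=1000):
    """Iterate balancing factors for T_ij = a_i * O_i * b_j * D_j * f[i][j]."""
    n, m = len(O), len(D)
    b = [1.0] * m  # start with all b_j = 1
    T = []
    for _ in range(max_iter):
        # Solve for a_i given the current b_j, then re-estimate b_j from a_i.
        a = [1.0 / sum(b[j] * D[j] * f[i][j] for j in range(m)) for i in range(n)]
        b = [1.0 / sum(a[i] * O[i] * f[i][j] for i in range(n)) for j in range(m)]
        T = [[a[i] * O[i] * b[j] * D[j] * f[i][j] for j in range(m)] for i in range(n)]
        # Column sums match D exactly after the b update; stop once rows match O.
        if max(abs(sum(T[i]) - O[i]) for i in range(n)) < tol:
            break
    return T


# Hypothetical example with two origins and two destinations.
O = [400.0, 600.0]
D = [300.0, 700.0]
dist = [[1.0, 2.0], [3.0, 1.0]]
beta = 2.0
f = [[x ** (-beta) for x in row] for row in dist]
T = doubly_constrained(O, D, f)
print(T)
```

Stopping on the row-sum error rather than on the change in $a_i$ and $b_j$ is a design choice of this sketch; it ties the tolerance directly to how well both constraints are met.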

S3.1 Maximum entropy approach
The earliest gravity model for spatial interaction was developed by analogy with Newton's law of universal gravitation but lacked a rigorous theoretical basis. Wilson proposed a maximum entropy approach that derives the gravity model by maximizing the entropy of a trip distribution [S9]:
$$\max \ln \Omega = \ln \frac{T!}{\prod_i \prod_j T_{ij}!} \quad \text{s.t.} \quad \sum_j T_{ij} = O_i, \quad \sum_i T_{ij} = D_j, \quad \sum_i \sum_j T_{ij} C_{ij} = C, \tag{S21}$$
where $\Omega$ is the number of distinct trip arrangements of individuals, $T$ is the total number of trips, $T_{ij}$ is the number of trips from location $i$ to location $j$, $O_i$ is the total number of departures from $i$, $D_j$ is the total number of arrivals at $j$, $C_{ij}$ is the travelling cost from $i$ to $j$ and $C$ is the total travelling cost.
According to the maximum entropy principle, the most likely trip distribution is the one with the largest number of microscopic states. Using the Lagrange multiplier method to solve Eq. (S21), we can get
$$T_{ij} = a_i O_i b_j D_j e^{-\beta C_{ij}}, \tag{S22}$$
where $a_i$ and $b_j$ are balancing factors and $\beta$ is the Lagrange multiplier of the total cost constraint. If we set $C_{ij} = \ln d_{ij}$, where $d_{ij}$ is the distance between $i$ and $j$, we can get a doubly-constrained gravity model with power distance function
$$T_{ij} = a_i O_i b_j D_j d_{ij}^{-\beta}. \tag{S23}$$
Wilson's maximum entropy derivation offers a theoretical basis for the gravity model.
However, the maximum entropy principle in statistical physics can only give the most likely macrostate (i.e., the most likely trip distribution matrix T) but cannot describe the individuals' decision processes (i.e., the microscopic mechanism) in the system [S10].
Meanwhile, the total cost in the maximum entropy method is not determined by the theory itself but must be specified externally [S11]. As this total cost cannot be estimated in the real world, the maximum entropy theory is less practical.
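The step from Wilson's constrained maximization to an exponential cost term can be sketched as follows. This is a standard textbook outline rather than a reproduction of the original derivation; it assumes Stirling's approximation $\ln x! \approx x \ln x - x$, a fixed total $T$, and multipliers written as $\lambda_i$, $\mu_j$ and $\beta$ with the sign convention chosen here.

```latex
\begin{align*}
\ln \Omega &= \ln T! - \sum_i \sum_j \ln T_{ij}!
  \approx T \ln T - T - \sum_i \sum_j \left( T_{ij} \ln T_{ij} - T_{ij} \right), \\
L &= \ln \Omega
  + \sum_i \lambda_i \Big( O_i - \sum_j T_{ij} \Big)
  + \sum_j \mu_j \Big( D_j - \sum_i T_{ij} \Big)
  + \beta \Big( C - \sum_i \sum_j T_{ij} C_{ij} \Big), \\
\frac{\partial L}{\partial T_{ij}} &= -\ln T_{ij} - \lambda_i - \mu_j - \beta C_{ij} = 0
  \quad \Longrightarrow \quad
  T_{ij} = e^{-\lambda_i} \, e^{-\mu_j} \, e^{-\beta C_{ij}}.
\end{align*}
```

The factors $e^{-\lambda_i}$ and $e^{-\mu_j}$ are then fixed by the departure and arrival constraints, while $\beta$ is tied to the total cost $C$, which is why $C$ has to be supplied from outside the theory.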

S3.2 Deterministic utility theory
Some scientists described the micro decision-making process of individual spatial interaction (destination choice) using the principle of utility maximization in economics [S10].
Earlier studies used deterministic utility theory to derive the gravity model. The derivation is given in terms of trips made by individuals from a single origin to many destinations [S12]. For an individual $k$ at origin $i$, assume that there are $\alpha m_j$ persons or things at each destination $j$ with which the individual at $i$ would like to interact per trip, where $m_j$ is the population at $j$ and $\alpha$ is a parameter. Then, $k$'s utility of tripmaking from $i$ to all destinations is [...] where $U$ [...] An individual's number of trips is constrained by the total cost that the individual can pay, [...] where $r$ is the cost per unit distance travelled and $C_i^{(k)}$ is the total amount of money that individual $k$ located at $i$ is willing to spend on travels per unit time.
[...] and using the Lagrange multiplier method to maximize Eq. (S24) under the constraint Eq. (S25), we can derive [...] The total number of trips taken by all individuals from $i$ to $j$ is obtained by summing the trips from $i$ to $j$ taken by all individuals at $i$: [...] where $C_i$ is the total amount of money that all individuals at origin $i$ are willing to spend on travels per unit time.
The main problem of this deterministic approach is that the total budget needs to be determined in advance. This is similar to the problem of Wilson's maximum entropy approach, which requires the prior constraint of the total cost. In addition, this method describes the individual's destination selection process over a continuous time period (i.e., the unit time). If the period is short enough and individuals can only complete one trip, then the individuals at a given origin will all select the same destination with the maximum utility, and there will be no dispersion of trips [S10].

S3.3 Random utility theory
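Domencich and McFadden applied random utility theory to many transport-related discrete choice problems [S13], including trip destination choice. In this method, the random utility $U_{ij}$ of a destination $j$ for the individuals starting from origin $i$ is defined as
$$U_{ij} = V_{ij} + \varepsilon_{ij}, \tag{S28}$$
where $V_{ij}$ is a nonstochastic element reflecting the observed attributes of $i$ and $j$, and $\varepsilon_{ij}$ is a random variable describing an unobserved element containing attributes of the alternatives and characteristics of the individual that we are unable to measure.
The individual will choose the destination $j$ that maximizes his utility, say
$$U_{ij} > U_{ij'}, \quad \forall j' \in J, \ j' \neq j, \tag{S29}$$
where $J$ is the set of all candidate destinations.
Since these utility values are stochastic, the choice probability of destination $j$ for any individual at $i$ is given by
$$P_{ij} = \Pr \left( V_{ij} + \varepsilon_{ij} > V_{ij'} + \varepsilon_{ij'}, \ \forall j' \in J, \ j' \neq j \right). \tag{S30}$$
If the random variables $\varepsilon_{ij}$ are independently and identically distributed Gumbel random variables, i.e.,
$$\Pr(\varepsilon_{ij} \le x) = e^{-e^{-x}}, \tag{S31}$$
then, from Eq. (S30), we can get [S14]
$$P_{ij} = \int_{-\infty}^{\infty} \Big( \prod_{j' \neq j} e^{-e^{-(x + V_{ij} - V_{ij'})}} \Big) e^{-x} e^{-e^{-x}} \, dx. \tag{S32}$$
Noting that $V_{ij} - V_{ij} = 0$, the factor $e^{-e^{-x}}$ is the $j' = j$ term of the product, so Eq. (S32) can be written as
$$P_{ij} = \int_{-\infty}^{\infty} \exp \Big( -e^{-x} \sum_{j' \in J} e^{-(V_{ij} - V_{ij'})} \Big) e^{-x} \, dx. \tag{S33}$$
Define $t = e^{-x}$ such that $dt = -e^{-x} dx$. When $x = \infty$, $t = e^{-\infty} = 0$, and when $x = -\infty$, $t = e^{\infty} = \infty$. Therefore, Eq. (S33) can be written as
$$P_{ij} = \int_0^{\infty} \exp \Big( -t \sum_{j' \in J} e^{-(V_{ij} - V_{ij'})} \Big) dt = \frac{1}{\sum_{j' \in J} e^{-(V_{ij} - V_{ij'})}} = \frac{e^{V_{ij}}}{\sum_{j' \in J} e^{V_{ij'}}}, \tag{S34}$$
which is the Logit model usually used in transport modal choice [S8]. If we set $V_{ij} = \ln m_j - \beta \ln d_{ij}$, we can get an origin-constrained gravity model
$$T_{ij} = O_i P_{ij} = O_i \frac{m_j d_{ij}^{-\beta}}{\sum_{j' \in J} m_{j'} d_{ij'}^{-\beta}}. \tag{S35}$$
Random utility theory accounts for the dispersion of trips from an origin and does not require a predetermined total budget. Therefore, the gravity model based on random utility theory seems superior to other approaches based on deterministic utility theory or maximum entropy theory [S10]. However, random utility theory relies on a delicate condition, namely the existence of an unobserved variable $\varepsilon_{ij}$ that has to obey the independent and identical Gumbel distribution.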
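The link between Gumbel-distributed noise and the Logit form can be illustrated with a small simulation. The sketch below assumes the standard Gumbel distribution with CDF $e^{-e^{-x}}$ and the systematic utility $V_{ij} = \ln m_j - \beta \ln d_{ij}$ from the text; the function names, the seed and the input values are hypothetical. It compares the closed-form Logit probabilities with the share of random draws in which each destination has the highest utility.

```python
import math
import random


def logit_probabilities(V):
    """P_j = exp(V_j) / sum_j' exp(V_j') for systematic utilities V."""
    v_max = max(V)  # subtract the maximum for numerical stability
    weights = [math.exp(v - v_max) for v in V]
    total = sum(weights)
    return [w / total for w in weights]


def gumbel(rng):
    """Standard Gumbel draw by inverse CDF: -ln(-ln(u)), u uniform on (0, 1)."""
    u = rng.random()
    while u == 0.0:  # random() may return 0.0, which would break the logarithm
        u = rng.random()
    return -math.log(-math.log(u))


def simulate_choices(V, n_draws, rng):
    """Share of draws in which each alternative maximizes V_j + Gumbel noise."""
    counts = [0] * len(V)
    for _ in range(n_draws):
        noisy = [v + gumbel(rng) for v in V]
        counts[noisy.index(max(noisy))] += 1
    return [c / n_draws for c in counts]


# Hypothetical destinations: populations m_j and distances d_ij from origin i.
m = [500.0, 2000.0, 800.0]
d = [2.0, 10.0, 5.0]
beta = 1.5
V = [math.log(mj) - beta * math.log(dj) for mj, dj in zip(m, d)]
P = logit_probabilities(V)
shares = [mj * dj ** (-beta) for mj, dj in zip(m, d)]
gravity = [s / sum(shares) for s in shares]
simulated = simulate_choices(V, 20000, random.Random(0))
print(P, gravity, simulated)
```

The Logit probabilities coincide with the gravity shares $m_j d_{ij}^{-\beta} / \sum_{j'} m_{j'} d_{ij'}^{-\beta}$, and the simulated shares approach them as the number of draws grows.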