Strategies and limitations in app usage and human mobility

Cognition has been found to constrain several aspects of human behaviour, such as the number of friends and the number of favourite places a person keeps stable over time. This limitation has been empirically defined in the physical and social spaces. But do people exhibit similar constraints in the digital space? We address this question through the analysis of pseudonymised mobility and mobile application (app) usage data of 400,000 individuals in a European country over six months. Despite the enormous heterogeneity of app usage, we find that individuals exhibit a conserved capacity that limits the number of applications they regularly use. Moreover, we find that this capacity steadily decreases with age, as does the capacity in the physical space, but with more complex dynamics. Even though people might have the same capacity, applications get added and removed over time. In this respect, we identify two profiles of individuals, app keepers and explorers, which differ in their stable (keepers) vs exploratory (explorers) behaviour regarding their use of mobile applications. Finally, we show that app capacity predicts mobility capacity and vice versa. By contrast, the behaviour of keepers and explorers may vary considerably across the two domains. Our empirical findings provide an intriguing picture linking human behaviour in the physical and digital worlds, which bridges research studies from Computer Science, Social Physics and Computational Social Sciences.

Algorithm 1: Algorithm for extracting the stop events from GPS sequences.
Input: Time-ordered list of a user's raw GPS positions R = [r_0, r_1, ..., r_n], their times T = [t_0, t_1, ..., t_n], a spatial threshold ∆s and a temporal threshold ∆t.
Output: The set S of a user's stop events.
left ← 0; S ← ∅;
while left < n do
    right ← minimum j such that t_j ≥ t_left + ∆t;
    if Diameter(R, left, right) > ∆s then
        left ← left + 1;
    else
        right ← maximum j such that j ≤ n and Diameter(R, left, j) < ∆s;
        S ← S ∪ (Medoid(R, left, right), t_left, t_right);
        left ← right + 1;
    end
end

• We reduce the number of points that are most likely not part of a stop event.
Thus, we filter out every point $r_i$ such that $d(r_{i-1}, r_i) < 10\,\text{m} \land d(r_i, r_{i+1}) < 10\,\text{m}$, as well as every point $r_j$ such that $d(r_{j-1}, r_j) > \Delta s \land d(r_j, r_{j+1}) > \Delta s$. Although simple, these heuristics keep the complexity around $O(n)$ on average and $O(n^2)$ in the worst case.
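The loop of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names (`extract_stop_events`, `diameter`, `medoid`, `euclid`) are hypothetical, the distance function is passed in as a parameter, and the example uses planar coordinates in metres with a Euclidean distance purely for readability. The diameter is recomputed naively on every extension, so the sketch does not reproduce the complexity discussed above.

```python
from itertools import combinations
from math import hypot


def euclid(p, q):
    """Planar distance; an assumption made only to keep the example simple."""
    return hypot(p[0] - q[0], p[1] - q[1])


def diameter(points, dist):
    """Largest pairwise distance among the given points (0 for fewer than two)."""
    return max((dist(p, q) for p, q in combinations(points, 2)), default=0.0)


def medoid(points, dist):
    """Point minimising the sum of distances to all other points."""
    return min(points, key=lambda p: sum(dist(p, q) for q in points))


def extract_stop_events(positions, times, delta_s, delta_t, dist=euclid):
    """Return a list of (medoid, start_time, end_time) stop events."""
    stops = []
    n = len(positions)
    left = 0
    while left < n:
        # First index whose timestamp is at least delta_t after times[left].
        right = next(
            (j for j in range(left, n) if times[j] >= times[left] + delta_t),
            None,
        )
        if right is None:
            break  # no later window can span delta_t either
        if diameter(positions[left:right + 1], dist) > delta_s:
            left += 1  # the user moved too far: slide the window start
            continue
        # Extend the window while its diameter stays below delta_s.
        while right + 1 < n and diameter(positions[left:right + 2], dist) < delta_s:
            right += 1
        window = positions[left:right + 1]
        stops.append((medoid(window, dist), times[left], times[right]))
        left = right + 1
    return stops


# Hypothetical trace: coordinates in metres, timestamps in minutes.
positions = [(0, 0), (10, 0), (4, 0), (1000, 0), (2000, 0),
             (3000, 0), (3006, 0), (3002, 0)]
times = [0, 5, 20, 25, 30, 35, 45, 55]
stops = extract_stop_events(positions, times, delta_s=50, delta_t=15)
```

On this toy trace the sketch yields two stop events, one around the first three points and one around the last three, while the fast-moving points in between are skipped.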
The Diameter algorithm can be further optimised by converting all coordinates to a Cartesian plane, then finding the smallest convex region containing all the points (the convex hull) and finally computing the diameter in linear time between the points of the convex hull. However, in this work we choose to have higher accuracy by using the original coordinates and defining d(i, j) as the Haversine great-circle distance between i and j. Given the average radius of the Earth $r$ and two points with latitudes $\phi_1, \phi_2$ and longitudes $\lambda_1, \lambda_2$ respectively, the Haversine distance $d$ between them is:

$$d = 2r \arcsin\left(\sqrt{\sin^2\left(\frac{\phi_2 - \phi_1}{2}\right) + \cos(\phi_1)\cos(\phi_2)\sin^2\left(\frac{\lambda_2 - \lambda_1}{2}\right)}\right)$$

The Haversine distance does not require projecting points onto a plane, and it is more accurate at both short and long distances.
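As an illustration of the distance defined above, the following Python sketch computes the Haversine distance and a brute-force diameter over a window of positions. The names `haversine` and `haversine_diameter`, the (latitude, longitude) tuple layout in degrees, and the mean Earth radius of 6,371 km are assumptions made for this example.

```python
from itertools import combinations
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius, in metres


def haversine(p, q, radius=EARTH_RADIUS_M):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, lam1 = map(radians, p)
    phi2, lam2 = map(radians, q)
    h = (sin((phi2 - phi1) / 2) ** 2
         + cos(phi1) * cos(phi2) * sin((lam2 - lam1) / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * radius * asin(sqrt(min(1.0, h)))


def haversine_diameter(points, left, right):
    """Largest pairwise Haversine distance among points[left..right] (inclusive)."""
    window = points[left:right + 1]
    return max((haversine(p, q) for p, q in combinations(window, 2)), default=0.0)


# One degree of longitude along the equator is roughly 111.2 km.
one_degree = haversine((0.0, 0.0), (0.0, 1.0))
```

The brute-force diameter compares every pair in the window, which is the simple quadratic variant rather than the convex-hull optimisation mentioned above.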

Stop locations
For each user, we define stop locations as the sequences of stop events that can be considered part of the same place. For example, if user A goes many times to the Colosseum in Rome, she could have many stop events (e.g., northern entrance, southern entrance) that can be grouped into a unique stop location (i.e., the Colosseum). To determine a stop location from stop events we use the DB-scan [2] algorithm, which groups points within ε = ∆s − 5 meters of each other to form a cluster with at least minPoints = 1 event. The complexity of DB-scan is O(n). We horizontally scale the computation across different cloud machines using Apache Spark. Taking previous work [1,7,3] as a reference, we choose ∆s = 50 meters and ∆t = 15 minutes. We qualitatively noticed that with ∆s = 30 meters (the same as the error threshold for our data filtering) the stop locations are noisier. Similarly, ∆t = 10 minutes may form some spurious stop locations.
We select ε = ∆s − 5 meters to avoid creating extremely long (and incorrect) chains of sequential stop events. Thus, ε = 45 meters. However, stop events and stop locations may be very sensitive to the ∆s and ∆t parameters. Therefore, we repeated our experiments with both ∆s = 60 meters and ∆t = 10 minutes, and we found no significant differences. For this reason, in the following sections we align our discussion with the existing literature and use ∆s = 50 meters and ∆t = 15 minutes.
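With minPoints = 1 every stop event is a core point, so the density-based clusters coincide with the connected components of the graph that links events at most ε apart. The sketch below illustrates this grouping with a small union-find. It is an assumption-laden example, not the Spark pipeline used in this work: the function name `cluster_stop_events` is hypothetical, coordinates are planar and expressed in metres, and the pairwise comparison is quadratic.

```python
from math import hypot


def cluster_stop_events(points, eps):
    """Label each point with the id of its eps-connected component."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if hypot(dx, dy) <= eps:
                parent[find(i)] = find(j)  # merge the two components

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]


# Hypothetical stop-event medoids, in metres, clustered with eps = 45.
events = [(0, 0), (40, 0), (80, 0), (200, 0), (230, 0)]
locations = cluster_stop_events(events, eps=45)
```

In this example the first three events end up in one stop location even though the first and third are 80 meters apart. This chaining effect is what a smaller ε helps to limit.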

From applications to mobility
We investigated the relationship between mobile app usage behaviour and mobility by correlating the capacity, activity, and strategy between app usage and mobility. However, temporally aggregated behaviour might hide choices people make at a smaller time scale. Thus, we break down people's behaviour on a daily, weekly and monthly basis and
• Individual behaviour: if an individual spends exceptionally more time than his/her average or baseline on mobile apps, does his/her mobility suffer? Defined as:
For a set of pairs (i, j) at time t, the Kendall rank coefficient measures how much the rank of the pair changed from t to t + 1. The coefficient is 1 when the ranks are identical, while it is −1 when they are fully reversed. In other words, we expect Kendall's τ to be positive and high when application usage is very similar to mobility, while we expect it to be negative in the presence of a trade-off between the two domains. Similarly, we also test this trade-off in the frequency domain, comparing the number of apps launched and the number of visited locations. Thus, we compare the app usage and mobility dynamics and look for any trade-off or positive correlation between these two domains. A strong negative correlation between the two domains would echo previous studies linking smartphone addiction to negative outcomes such as obesity [6], while a strong positive correlation would mean that people use phones especially when they move, or with a scale-free dynamic. Table S1 summarises the results of this analysis. As shown in the table, we do not find any strong negative correlation between these variables, which would represent the existence of a trade-off between mobile phone usage and human mobility. On the contrary, in the frequency domain we do find a slight positive correlation in the Average and Raw behaviours. In other words, when people launch apps more than other people do, their mobility also increases.
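To make the interpretation of Kendall's τ concrete, the following Python sketch computes the coefficient from concordant and discordant pairs. It is only an illustration: it implements the tie-agnostic τ-a variant, which may differ from the exact definition used for Table S1, and the function name `kendall_tau` and the toy series are hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both series order this pair the same way
        elif sign < 0:
            discordant += 1  # the two series disagree on this pair
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical daily series: minutes on apps vs minutes spent in visited places.
app_minutes = [30, 45, 60, 90]
same_order = kendall_tau(app_minutes, [100, 120, 150, 200])
trade_off = kendall_tau(app_minutes, [200, 150, 120, 100])
```

Here `same_order` equals 1 because both series rank the days identically, whereas `trade_off` equals −1 because the ranking is fully reversed, which is the pattern a strict trade-off between the two domains would produce.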
In summary, in this manuscript we find that capacity is positively correlated between the two domains, but users might adopt different strategies in each domain. Empirical results have shown that intense use of the phone does not necessarily predict well-being [4]. Similarly, our results on the trade-off suggest that people, on average, do not decrease (increase) their physical mobility (as measured by the time spent in visited places) because of high (low) phone usage. We only found a slight positive correlation in the Average and Raw behaviours, which might be a consequence of intense phone usage during commuting [8]. However, the lack of agreement between the Raw, Average and Individual's average behaviours might suggest that these two domains reflect different aspects of human behaviour. We leave the investigation of this hypothesis to future work.