Quantifying Human Engagement into Playful Activities

Engaging in playful activities, such as playing a musical instrument, learning a language, or performing sports, is a fundamental aspect of human life. We present a quantitative empirical analysis of the engagement dynamics in playful activities. We do so by analyzing the behavior of millions of players of casual video games, and we discover a scaling law governing the engagement dynamics. This power-law behavior is indicative of a multiplicative (i.e., “happy-get-happier”) mechanism of engagement characterized by a set of critical exponents. We also find that, depending on the critical exponents, there is a phase transition between the standard case, where all individuals eventually quit the activity, and another phase where a finite fraction of individuals never abandon the activity. The behavior that we have uncovered in this work might not be restricted to human interaction with video games. Instead, we believe it reflects a more general and profound pattern in how humans become engaged in challenging activities with intrinsic rewards.

Humans are strongly drawn to trying new experiences that eventually become pleasant daily routines. Playing a musical instrument, speaking foreign languages, and practicing sports or hobbies are all activities whose full enjoyment requires an investment of time and training that eventually pays off. One interesting question is how humans get engaged in and come to love these activities, which offer both a challenge and an intrinsic reward. What is the training or learning process, and how does it affect their level of enjoyment? How can we measure and quantify fun?
Before the era of modern technology, answering this type of question relied on the accumulated knowledge obtained from qualitative observations of single individuals, made in different conditions by different observers. This makes it very difficult to extract general laws of human behavior. The widespread use of the Internet and the worldwide connectivity that it provides is changing this picture radically and fast. For the first time in human history, it is possible to monitor human actions on an unprecedentedly large scale, allowing us to uncover precise and quantitative laws of human behavior [1-3]. Nowadays, we have the ability to measure, with impressive precision, our mobility patterns [4,5], our musical tastes [6,7], or the way in which ideas spread and crystallize across populations [8,9], providing us with a very accurate picture of some of the key aspects of human behavior at the large scale [10-13].
Fostered by the spread of smartphones and tablets, casual video games have become one of the most popular current pastimes. These are games with simple rules and game dynamics that can be played in brief bursts in a casual way, e.g. during breaks or daily commuting. Some of these games, like Candy Crush Saga (the flagship game of King Digital Entertainment), have reached outstanding popularity. As of the fourth quarter of 2018, King's games were played by 268 million monthly active players, with millions of players playing many millions of levels every day in Candy Crush Saga alone [14]. Hence, they are an ideal platform for studying how humans become engaged in a rewarding activity.
There is a vast literature on measuring video game engagement and enjoyment. However, most of these studies are based on (1) surveys with a moderate number of individuals [15-18], (2) measurements of physiological metrics on players while they are playing [19], or (3) studies of psychological motivations [18,20-22]. In this paper, it is not our intention to enter into the psychological, motivational, behavioral, or social aspects of video game playing, nor to criticize the standard psychometric, behavioral, physiological, or questionnaire-based evaluations of engagement, which are performed on a limited number of individuals (typically aware of being subjects of study) and over a short time span. Our work is radically different, as it approaches the problem from a data-driven point of view by analyzing the real behavior of a large population of individuals as they play the game. In some of the games we have analyzed, we follow the individual behavior of a cohort of 10 million players during a period of two years. This volume of data allows us to quantify empirically users' engagement vs progression in a way that was not possible before the big data era. Besides, our analysis reveals a scaling law that is universal (across many different games, player segmentations, and countries), with profound theoretical implications.
Specifically, we show that the progression, engagement, and quitting of players in casual games can be analyzed and simulated using a simple stochastic model. The level of enjoyment and engagement of a fun activity, like a video game, can be measured and shows a common scaling behavior described by a power law as a function of the progression in the game. This result suggests that enjoyment, like popularity, wealth, and many other phenomena, is a multiplicative process [23-26]: the more you are into it, the more engaged you become. Our empirical findings have interesting implications not only for casual games but also for generic engagement dynamics in a variety of different activities, reflecting a global trend of human behavior.

Results
Casual games. Many typical casual games, like Candy Crush Saga, pose a linear sequence of levels that a player can access one by one as the previous level is successfully completed (see Fig. 1). Players start the game at level 1 and progress level by level in increasing order. At each level, the player must achieve a predefined goal to pass it (e.g. collect a specific number of candies or reach a certain score) using a limited number of moves, resources, or time. Each attempt to pass a level can be successful, meaning that the player passes that level and can play the next one, or unsuccessful. Alternatively, the player can become tired or frustrated at some point and decide to quit the game. Each level always involves randomness, either in the initial configuration or in the dynamics. This makes it natural to model game dynamics as a stochastic process [27,28].
To model player progression and experience in the game, we use two general indicators: one to quantify the total time spent in the game and another to measure the progression within the game. In casual games, the real-time activity (i.e. how often and for how long the person plays the game) is not a good measure of the actual time spent in the game. This is because these games are very often played in short breaks or free time, which is unpredictable and not controlled by the player. Instead, we use the accumulated number of attempts as an activity-independent measure of the total "time" spent in the game. The maximum level achieved after a given number of attempts is an indicator of game progression, i.e., of how far a player is in the game (see Fig. 1b). With this strategy, we monitor the actual progression of players in the game decoupled from their real-world activity.
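The two indicators described above can be sketched in a few lines of Python. This is a minimal illustration only: the log format (a chronological list of `(level_played, passed)` pairs) and the function name `progression_curve` are hypothetical and do not reflect the schema of the actual dataset. "Maximum level achieved" is interpreted here as the highest level the player has unlocked.

```python
from typing import Iterable, List, Tuple


def progression_curve(attempts: Iterable[Tuple[int, bool]]) -> List[Tuple[int, int]]:
    """Return (accumulated attempts, maximum level reached) after each attempt.

    `attempts` is a chronological log of (level_played, passed) pairs.
    This log format is a hypothetical illustration, not the dataset's schema.
    """
    curve = []
    max_level = 1  # every player starts at level 1
    for count, (level, passed) in enumerate(attempts, start=1):
        if passed:
            # passing level n unlocks level n + 1
            max_level = max(max_level, level + 1)
        curve.append((count, max_level))
    return curve


# One player: passes level 1 at once, needs three attempts for level 2,
# then fails the first attempt at level 3.
log = [(1, True), (2, False), (2, False), (2, True), (3, False)]
print(progression_curve(log))
```

Wall-clock timestamps never enter the computation, which is the point of using attempts as the clock: two players with the same attempt log have the same curve regardless of how their sessions were spread over real days.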
For these games, the dynamics of game progression can be modeled in a very simple way using Continuous Time Random Walks (CTRW) [29,30], as described in detail in the Supplementary Information (SI). In our model, we assume that all players can be considered as identical and independent. When a player reaches a new level, there are two competing random processes taking place simultaneously: (1) the random number of attempts required to pass that level, $\tau_p$, and (2) the random time, measured in number of attempts, that the player takes to get bored or frustrated and decide to abandon the game, $\tau_a$. For a given level, the final fate of the player depends on which of these random times is shorter. If $\tau_p < \tau_a$, the player passes the level and jumps to the next one; otherwise the player quits the game. These two times are assumed to be statistically independent random variables with probability density functions
$$\psi_p^n(t) = \frac{1}{t_p(n)}\, e^{-t/t_p(n)}, \qquad \psi_a^n(t) = \frac{1}{t_a(n)}\, e^{-t/t_a(n)}, \tag{1}$$
where $t_p(n)$ and $t_a(n)$ are the average pass and abandon times at level $n$. [...] is just the average number of attempts needed by players that passed level $n$ to pass it. The churn probability $p_c(n)$ is the total number of players that abandoned at level $n$ divided by the total number of players that played level $n$. We consider that a player has abandoned the game when the player shows no activity during the remainder of the observation time window. Consequently, the estimation of $p_c$, and so of $t_a$, depends on the observation time window of the dataset. In the SI file, we report estimations of $t_a$ for the same cohort of players, increasing the total observation window from 60 to 600 days. The abandon time collapses into a clean power law as we increase the size of the window.
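To make the competition between the two clocks concrete, the following Python sketch works out what two independent, exponentially distributed times imply for the observable quantities at one level. It relies only on standard properties of exponential variables: the probability that the abandon clock fires first, and the mean of the minimum of the two times (which, for exponentials, is also the mean time spent at the level by players who pass it). The function names are hypothetical, and the inversion step is an assumption of this sketch rather than a statement of the estimator used in the paper.

```python
def competing_exponentials(t_pass: float, t_abandon: float):
    """Observable quantities for two independent exponential clocks.

    t_pass and t_abandon are the mean pass and abandon times of one level.
    Returns (churn probability, mean time spent at the level).
    """
    rate_p, rate_a = 1.0 / t_pass, 1.0 / t_abandon
    p_churn = rate_a / (rate_p + rate_a)   # P(tau_a < tau_p)
    mean_time = 1.0 / (rate_p + rate_a)    # E[min(tau_p, tau_a)]
    return p_churn, mean_time


def invert_observables(p_churn: float, mean_time: float):
    """Recover (t_pass, t_abandon) from the two observables.

    Assumes exponential clocks and 0 < p_churn < 1; this inversion is a
    sketch, not necessarily the paper's estimator.
    """
    t_pass = mean_time / (1.0 - p_churn)
    t_abandon = mean_time / p_churn
    return t_pass, t_abandon


# Example: a level that takes 2 attempts to pass on average, played by
# users who would tolerate 8 attempts on average before giving up.
p_c, T = competing_exponentials(2.0, 8.0)
print(p_c, T)
print(invert_observables(p_c, T))
```

Under these assumptions, a level with a long abandon time relative to its pass time shows a small churn probability, which is why the abandon time can be read as a measure of engagement that is separate from difficulty.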
Measuring engagement. Figure 2 shows an example of the average abandon and pass times of the different levels of a game, along with the behavior of the survival probability of players in the game as a function of the level and of the number of attempts. The data correspond to a one-week cohort of 11,836,502 players of Candy Crush Saga on the Facebook platform, starting in 2014 and followed for 2 years.
The empirical data reveal a very interesting behavior for the abandon time, $t_a(n)$. After an initial number of levels, typically 10-20, where the player is discovering the game (or the activity) and deciding whether he/she likes it or not, the engagement follows a power-law behavior of the form $t_a(n) \sim n^{\alpha}$, with an exponent $\alpha$ around 1.1. (The value of the exponent for most analyzed games is in the range [1.0, 1.5]. In short datasets there is a small plateau at high levels, a consequence of the finite observation time window and the end-of-content effect; see Fig. S5.) As a consequence of such a fast growth rate, players behave very differently depending on their progression throughout the game, suggesting a "happy-get-happier" mechanism as the underlying explanation. The average pass time, on the other hand, is an indicator of the relative difficulty of the level as perceived by a player that has reached level $n$ by his/her own means. Therefore, $t_p(n)$ is a combination of the intrinsic difficulty of the level and the learning curve of players [32] and, in general, we expect it to show a concave (sublinear) dependency on the progression level $n$. Consider, for instance, the case of learning a musical instrument. It is clear that the Minuet in G (BWV 114) from the Notebook of Anna Magdalena Bach is objectively simpler than the Bach-Brahms Chaconne in D minor BWV 1004 (for the left hand alone). Yet, the effort to learn the former (and so to advance in the progression) is perceived by a first-year piano student as higher than the effort to learn the latter as perceived by, for instance, the great pianist Daniil Trifonov. We thus expect $t_p(n)$ to grow with $n$ in a concave way. Our empirical analysis indicates that this is indeed the case. As a matter of fact, in the studied datasets, the average pass time after the first 10-20 tutorial levels can be reasonably fitted by a power law $t_p(n) \sim n^{\beta}$, with an exponent $\beta$ in the range [0.1, 0.5].
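As an illustration of how a power-law exponent such as $\alpha$ or $\beta$ can be read off from per-level averages, the sketch below performs an ordinary least-squares fit of log time against log level, discarding the initial levels. This is one simple estimator chosen for illustration; the fitting procedure actually used for the figures is not specified in this section, and the function name `fit_power_law` and the cutoff parameter `n_min` are hypothetical.

```python
import math


def fit_power_law(levels, times, n_min=20):
    """Fit times ~ prefactor * levels**exponent by least squares in log-log space.

    Only levels >= n_min are used, to skip the initial discovery levels.
    Returns (exponent, prefactor).
    """
    pts = [(math.log(n), math.log(t))
           for n, t in zip(levels, times) if n >= n_min and t > 0]
    m = len(pts)
    mean_x = sum(x for x, _ in pts) / m
    mean_y = sum(y for _, y in pts) / m
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return exponent, prefactor


# Synthetic check: noiseless data generated with exponent 1.1.
levels = list(range(1, 501))
abandon_times = [3.0 * n ** 1.1 for n in levels]
print(fit_power_law(levels, abandon_times))
```

On real data the high-level tail is noisy and may show the plateau mentioned above, so the fitted exponent can be sensitive to the range of levels included.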
These scaling laws have important consequences for the global dynamics of the game. Indeed, as we show in the SI, there is an (infinite order) phase transition as a function of the parameters $\alpha$ and $\beta$ between a standard phase, where all players eventually quit the game, and an "enthusiastic" phase, where a finite fraction of players never abandon the game. For $\alpha - \beta < 1$, the probability of a player quitting the game at level $n$ or higher follows a Weibull distribution of shape parameter $\beta - \alpha + 1$, that is, a stretched exponential of the form $S(n) \sim e^{-(n/n_0)^{\beta-\alpha+1}}$, with $n_0$ a scale constant. In this standard phase, the probability that a player never abandons the game is zero. Instead, when $\alpha - \beta > 1$ there is a finite probability that players never abandon the game, provided that the game has infinite content. This probability can be computed as [...] (see SI for a formal proof). In this "enthusiastic" phase, the survival probability for those players that eventually do abandon the game follows a power law of the form $S_f(n) \sim n^{1+\beta-\alpha}$. This implies that the higher the value of $\alpha - \beta$, the faster $S_f(n)$ decays, so that players either abandon the game at the beginning of the progression or keep playing forever. Interestingly, all the analyzed casual games seem to be below but very close to the critical point $\alpha = 1 + \beta$, so that the survival probability is well described by a Weibull distribution. Figure 3b shows the survival probability for finite realizations $S_f(n)$ below, at, and above the critical point $\alpha_c$. The agreement with the theoretical predictions is remarkable.
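The two phases can be illustrated numerically with a short Python sketch. It assumes pure power laws with unit prefactors, $t_a(k) = k^{\alpha}$ and $t_p(k) = k^{\beta}$, and exponential pass and abandon times, so that the churn probability at level $k$ is $t_p/(t_p + t_a)$ and the probability of getting past level $n$ is the product of the per-level pass probabilities. These simplifications, and the function name `survival_by_level`, are assumptions of the sketch and not the formal treatment given in the SI.

```python
def survival_by_level(alpha: float, beta: float, n_levels: int):
    """Probability of passing levels 1..n, for n = 1..n_levels.

    Assumes t_a(k) = k**alpha and t_p(k) = k**beta (unit prefactors, an
    illustrative simplification) and exponential pass/abandon times.
    """
    s = 1.0
    curve = []
    for k in range(1, n_levels + 1):
        t_a, t_p = k ** alpha, k ** beta
        p_churn = t_p / (t_p + t_a)  # abandon clock fires before pass clock
        s *= 1.0 - p_churn
        curve.append(s)
    return curve


# alpha - beta = 0.5: standard phase, survival keeps decaying.
# alpha - beta = 2.0: "enthusiastic" phase, survival levels off above zero.
for alpha, beta in [(0.8, 0.3), (2.3, 0.3)]:
    curve = survival_by_level(alpha, beta, 10000)
    print(alpha - beta, curve[99], curve[999], curve[9999])
```

With these unit prefactors, the case $\alpha - \beta = 2$ levels off near 0.27, while for $\alpha - \beta = 0.5$ the survival probability becomes vanishingly small as the number of levels grows, in line with the distinction between the two phases described above.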
Mimicking player progression by simulation. In our model, we make three main assumptions: (i) the pass and abandon times are independent; (ii) both times are exponentially distributed; and (iii) all players can be considered as statistically identical. To verify the validity of these assumptions, (i) we performed a detrended fluctuation analysis that verifies that both times are truly independent (see Fig. S2); (ii) we verified that the distributions of abandon and pass times of all levels are exponential to a very good approximation (see Fig. S3); and (iii) we show that considering all players as identical reproduces their progression and survival accurately. To test the validity of this last assumption and of the model, we simulated the progression and churn of a cohort of identical players using the simple stochastic algorithm described in the Methods section, with the abandon and pass times measured for the real dataset as input. Figure 2b compares the real data with the results of the simulations for the survival probability in levels and attempts (i.e. the fraction of initial players still active after playing a given number of attempts or levels). The simulations reproduce the real survival closely (except for small finite-size effects in the tail), supporting the validity of the model and of the assumption that all players can be considered as identical. Accordingly, the abandon time is indeed an intrinsic, difficulty-independent measure of the average engagement of players at that level. Hence, a remarkable aspect of the model is that it can quantitatively measure human engagement and how it evolves as players progress in the game.
Universal behavior. The data for the abandon time shown in Fig. 2 for a specific game (Candy Crush Saga) clearly show that engagement increases as a power law as the player gets further into the game. We repeated the analysis for different Saga games: Farm Heroes, Papa Pear, Candy Crush Soda and Pyramid Solitaire (see Fig. 4). These games are very different in terms of genre (e.g. Candy Crush is a match-three tile-swapping game; Papa Pear is a physics-based bouncing game; Pyramid is a card solitaire), target audience, graphics, mechanics and design. Remarkably, all of them exhibit a common power-law behavior of engagement, suggesting that this evolution of the engagement in a fun activity may be universal. The same happens when we analyze data corresponding to players from different continents, platforms and periods of time (see Fig. S4).
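A cohort simulation of this kind can be sketched as follows. The code is a minimal illustration based only on the rules stated in the text (at each level, draw an exponential pass time and an exponential abandon time, and advance only if the pass time is shorter); it is not the algorithm of the Methods section, and the function name `simulate_cohort`, the input power laws and their prefactors are hypothetical.

```python
import random


def simulate_cohort(t_pass, t_abandon, n_players, seed=0):
    """Simulate identical, independent players through a sequence of levels.

    t_pass[i] and t_abandon[i] are the mean pass and abandon times of
    level i + 1. Returns the fraction of players that pass each level.
    """
    rng = random.Random(seed)
    n_levels = len(t_pass)
    passed = [0] * n_levels
    for _ in range(n_players):
        for i in range(n_levels):
            # expovariate takes a rate, i.e. the inverse of the mean time
            tau_p = rng.expovariate(1.0 / t_pass[i])
            tau_a = rng.expovariate(1.0 / t_abandon[i])
            if tau_p >= tau_a:
                break  # the player quits at this level
            passed[i] += 1
    return [count / n_players for count in passed]


# Illustrative inputs (not measured values): power-law pass and abandon times.
levels = range(1, 31)
t_pass = [n ** 0.3 for n in levels]
t_abandon = [5.0 * n ** 1.1 for n in levels]
survival = simulate_cohort(t_pass, t_abandon, n_players=20000, seed=1)
print(survival[0], survival[9], survival[29])
```

To compare with a real cohort, the two input lists would be replaced by the per-level abandon and pass times estimated from the data, and the simulated curve would be plotted against the measured survival.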

Discussion
We have seen that it is possible to quantify and model progression and churn in a playful activity or habit, like a video game, as a competition between two ingredients: relative difficulty and engagement. Our big data analysis of the system allowed us to find a very precise measure of engagement, which shows a power-law trend indicative of a happy-get-happier mechanism. In this work, we have focused on the particular case of engagement in video games since, to the best of our knowledge, it is the only system where the amount of available data allows us to elucidate sound statistical laws. However, we believe the process can be generalized to describe engagement in other activities: difficulty is a measure of the training cost and engagement is a measure of the reward or tolerance. Our model shows that a delicate balance between these two ingredients is needed to avoid early churn and that having a very difficult or traumatic experience at the initial stages would lead to massive churn. In addition, there is an interesting phase transition, controlled by how difficulty and engagement grow relative to each other along the progression, that leads to a finite probability that the person never abandons the activity. An interesting example is learning to play a musical instrument and, in general, any rewarding intellectual activity, like doing scientific research or artistic creation. Our model predicts a phase where the probability that individuals never abandon the activity is non-zero. This may seem obvious in these cases. Indeed, after many years of intense training, it is very unlikely that a person who has reached an advanced level would stop playing the piano or doing research [33]. Certainly, the amount of content in such disciplines is, basically, unlimited, and the intellectual reward of continuing them is so high that it would be highly improbable that anyone at an advanced level would quit the activity.
The importance of our framework lies precisely in its ability to explain when this behavior is possible and under what precise conditions. The model could be helpful to perform a similar analysis in other fields, to quantify tolerance and enjoyment, and to design smooth learning procedures that facilitate, for instance, healthy habits (like sports) or minimize early school leaving.

Methods
Empirical estimation of average abandon and pass times of individual levels. In our model, we assume that pass and abandon times are statistically independent random variables exponentially distributed according to Eq. (1). For mathematical tractability we take t as a continuous variable. This assumption does not affect any of the conclusions of this work. The corresponding survival probabilities, representing the probability that the time required to pass or abandon at level $n$ is larger than $t$, are $e^{-t/t_p(n)}$ and $e^{-t/t_a(n)}$, respectively. The average abandon, $t_a(n)$, and pass, $t_p(n)$, times cannot be measured directly from the data. The reason is that abandon and pass times are unconditioned random processes, that is, $\psi_p^n(t)$, for instance, accounts for the distribution of pass times at level $n$ if players were not allowed to quit the game, which is a condition that is not met in