Developmental asymmetries in learning to adjust to cooperative and uncooperative environments

Learning to successfully navigate social environments is a critical developmental goal, predictive of long-term wellbeing. However, little is known about how people learn to adjust to different social environments, and how this behaviour emerges across development. Here, we use a series of economic games to assess how children, adolescents, and young adults learn to adjust to social environments that differ in their level of cooperation (i.e., trust and coordination). Our results show an asymmetric developmental pattern: adjustment requiring uncooperative behaviour remains constant across adolescence, but adjustment requiring cooperative behaviour improves markedly across adolescence. Behavioural and computational analyses reveal that age-related differences in this social learning are shaped by age-related differences in the degree of inequality aversion and in the updating of beliefs about others. Our findings point to early adolescence as a phase of rapid change in cooperative behaviours, and highlight this as a key developmental window for interventions promoting well-adjusted social behaviour.


Prior expectations and social preferences: IQ and sex effects
In a series of robust regression analyses (5000 bootstraps), we examined the effects of IQ and sex on participants' social preferences (indifference points) and prior expectations across all ages.
Compared to boys, girls also expected that others would choose to have more than the participant (Sex, B = 0.457, β = 0.135, P = 0.035, 95% CI [0.033, 0.881]). We observed no sex differences in disadvantageous inequality aversion or in prior expectations of others' trust behaviour (all Ps > 0.05).
Next, we assessed correlations between advantageous and disadvantageous inequality aversion, and between prior expectations in the Trust Game and the Coordination Game. Inequality aversions were weakly correlated, indicating that people who disliked being behind also liked being ahead (r = 0.176, P = 0.012; N = 202). This relation remained significant when controlling for age (r = 0.151, P = 0.032). Prior expectations about others in the Trust Game and the Coordination Game were not correlated (r = 0.069, P = 0.282; N = 245), also not when controlling for age (r = 0.067, P = 0.297).
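The age-controlled correlations reported above can be read as partial correlations. As a minimal sketch, assuming a first-order partial correlation was used (the text does not name the exact procedure), the computation looks as follows. The function names and the toy data are hypothetical and are not the study data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_r(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Made-up illustration: z (standing in for age) is uncorrelated with x and y,
# so controlling for it leaves the correlation unchanged.
x = [1, 1, -1, -1]
y = [2, 1, -1, 0]
z = [1, -1, 1, -1]
print(pearson_r(x, y), partial_r(x, y, z))
```

When the control variable is related to both measures, the partial correlation differs from the raw one, which is the pattern seen in the small drop from r = 0.176 to r = 0.151 above.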

Developmental changes in the non-social learning task
To assess whether people are able to adjust their decision-making in a non-social context, we included a non-social learning task with two environments of computer opponents (for the payoff matrix, see Figure S1a). Similar to the economic games, participants could maximise their payoffs by coordinating their choices with those of the computer opponent: choosing option A when playing against the environment that most often (11/15 trials) chose X, and choosing option B when playing against the environment that most often (11/15 trials) chose Y.
We fitted a logistic GLMM to decisions (coded as correct/incorrect) in the non-social learning task, with linear and quadratic age as predictors. Performance increased linearly with age (Figure S1b; see Table S5 for all statistics).
Figure S1. Non-social learning task. a, layout of the learning task. b, performance per age cohort. c, performance over trials pooled across all participants (N = 245). Error bars in panel b and shaded areas in panel c show the standard error of the mean (s.e.m.).

Age-related differences in the non-social vs. social learning game
The main focus of this study was to compare the influence of prior beliefs, social preferences, and feedback-based updating across age groups in different social learning environments. However, the inclusion of a non-social task also offers the opportunity to compare learning performance in social and non-social tasks more generally. In an exploratory analysis, we therefore compared performance in the non-social learning task with performance in the Trust Game and in the Coordination Game.
We fitted a binomial GLMM to decisions (coded as correct/incorrect) with linear age * type of task (social vs. non-social) as predictors. Participant was included as a random intercept to account for the repeated nature of the choice data. Overall performance (grouped across environments) was lower in the Trust Game than in the non-social learning task (main effect of Task, P < 0.001; Trust Game mean accuracy = 0.613, SD = 0.487; non-social mean accuracy = 0.699, SD = 0.459). Follow-up tests showed that this advantage for learning in the non-social task was present for each age group.

Testing confounding effects of sex and IQ on choice behaviour in the social-learning games
To assess possible effects of sex and IQ on choice behaviour in the learning tasks, we ran additional [...] there was a main effect of IQ (P < 0.001), indicating that participants with a higher IQ performed better in the non-social learning task. These results suggest that IQ and sex differences did not influence performance on the social learning tasks, and did not influence any of the observed age-related changes in adjustment behaviour. However, IQ did relate positively to a general learning tendency in the non-social learning task.

Calculation of social preferences Dictator Game
The Dictator Game (DG) was used to estimate people's advantageous inequality aversion. The aim of the DG is to identify the point at which a participant is indifferent between (10, 0) and the equal distribution. To this end, participants always chose between (10, 0) and (5, 5) in the first trial. The number of points in the subsequent trials depended on the choice in that first trial.
For participants who switched multiple times between the equal and unequal distribution (i.e., who did not show a consistent choice pattern), we fitted a softmax function to their DG choices to approximate their indifference point. In particular, we coded a choice for the unequal distribution as 0 and a choice for the equal distribution (x, x) as 1. To these choices, 6 for each participant, we then fitted a softmax function $y = [1 + \exp(-Z)]^{-1}$, where $Z = a + b x$. Subsequently, we solved for y = 0.5.
Trial sequence depends on the choice in the first trial: upper sequence when first choice is for the unequal distribution, lower sequence when first choice is for the unequal distribution.
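As a minimal sketch of this approximation, the code below fits the softmax by maximum likelihood with Newton-Raphson and then solves y = 0.5, which implies Z = 0 and hence x = -a/b. The estimation method, the function names, and the example amounts and choices are assumptions made for illustration; the amounts offered in the actual trial sequence are not reproduced here.

```python
import math

def fit_logistic(xs, ys, n_iter=50, tol=1e-10):
    """Fit y = 1 / (1 + exp(-(a + b*x))) by maximum likelihood (Newton-Raphson)."""
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        ps = [1.0 / (1.0 + math.exp(-(a + b * x))) for x in xs]
        # Gradient of the log-likelihood with respect to a and b
        g_a = sum(y - p for y, p in zip(ys, ps))
        g_b = sum((y - p) * x for x, y, p in zip(xs, ys, ps))
        # Entries of the negated Hessian (a 2 x 2 symmetric matrix)
        ws = [p * (1.0 - p) for p in ps]
        h_aa = sum(ws)
        h_ab = sum(w * x for w, x in zip(ws, xs))
        h_bb = sum(w * x * x for w, x in zip(ws, xs))
        det = h_aa * h_bb - h_ab ** 2
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (h_aa * g_b - h_ab * g_a) / det
        a, b = a + step_a, b + step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return a, b

def indifference_point(a, b):
    """Value of x at which the fitted curve crosses y = 0.5 (Z = 0)."""
    return -a / b

# Hypothetical inconsistent pattern: 0 = unequal chosen, 1 = equal (x, x) chosen
amounts = [1, 2, 3, 4, 5, 6]
choices = [0, 0, 1, 0, 1, 1]
a_hat, b_hat = fit_logistic(amounts, choices)
print(indifference_point(a_hat, b_hat))
```

This only works for inconsistent patterns: with a single clean switch the choices are perfectly separable and the maximum-likelihood estimate does not exist, which is one reason to read the indifference point directly from the switch in those cases.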
In the responder stage of the Ultimatum Game (UG), participants were told that they were now paired with a new player and that they could accept or reject this player's proposals (Figure S). As a responder, participants responded to 6 proposals. The first proposal was an equal split, but every subsequent proposal was more beneficial for the other than for the participant (i.e., (5, 5), (4, 6), (3, 7), (2, 8), (1, 9), (0, 10)). This responder stage was used to obtain a measure of disadvantageous inequality aversion, given that the minimum acceptable offer in the UG is used as a point estimate of disadvantageous inequality aversion for each individual.
In the Ultimatum Game, the majority of participants showed a consistent choice pattern (i.e., up to 1 switching point, n = 234, 95.5%). In total, 11 participants had multiple switching points (up to 3), and again we used the softmax function described above for the Dictator Game to approximate their indifference point. For 1 participant the approximated indifference point was outside of the theoretically possible range [0,5]. This participant was therefore excluded from analyses that used the disadvantageous inequality aversion measure.
As in the DG, we used the calculated and estimated indifference points (IPs) in our behavioural analyses. To include individuals' disadvantageous inequality aversion in subsequent reinforcement learning models, we transformed the IPs from the UG following Blanco et al. (2011): $\alpha = \mathrm{IP} / (2(5 - \mathrm{IP}))$. If no offer was rejected, we set disadvantageous inequality aversion to 0. If all offers were rejected, disadvantageous inequality aversion was set to 4.5. We used these transformed inequality aversions (α and β) to calculate the subjective pay-off matrix in our RL modelling (see Methods, section Computational modelling in the main text).
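The transformation can be sketched as follows. The helper names are hypothetical. The utility function uses the standard Fehr-Schmidt form on which the Blanco et al. (2011) transformation is based; treating it as the subjective pay-off used in the RL models is an assumption here, since the exact specification is given in the main text.

```python
def alpha_from_ip(ip, equal_share=5.0):
    """Disadvantageous inequality aversion: alpha = IP / (2 * (5 - IP)).

    The expression is undefined at IP = 5, so the input is restricted to [0, 5).
    """
    if not 0.0 <= ip < equal_share:
        raise ValueError("IP must lie in [0, 5) for the transformation to be defined")
    return ip / (2.0 * (equal_share - ip))

def fehr_schmidt_utility(own, other, alpha, beta):
    """Assumed subjective pay-off: own points minus penalties for inequality."""
    behind = max(other - own, 0.0)  # disadvantageous inequality
    ahead = max(own - other, 0.0)   # advantageous inequality
    return own - alpha * behind - beta * ahead

# Illustration: a responder whose indifference point lies at 2.5 points
alpha = alpha_from_ip(2.5)
print(alpha)
# At the indifference point, accepting (2.5, 7.5) is worth the same as rejecting (0)
print(fehr_schmidt_utility(2.5, 7.5, alpha, beta=0.0))
```

The formula returns 0 at IP = 0 and 4.5 at IP = 4.5, which matches the two boundary values assigned above.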

Computational models assessing behavioural adjustment in the non-social learning task
To assess behavioural adjustment outside of social interactions, we fitted the reinforcement learning models to decisions in the non-social setting (see Figure S5). Following the main text, we pooled all data from participants per cohort and considered models with and without decay in learning rates across trials. As there were no other people involved, we did not consider models with social preferences, and assumed that participants' priors in the first trial were such that they expected their computerized opponent to choose either action with equal (50%) probability.
We observed that models including a decay in learning rates fitted the data better. Figure S5 illustrates the estimated learning parameters of the best-fitting model. For each age cohort, learning rates decreased strongly over time. Interestingly, and in contrast to the Trust Game and the Coordination Game, the weakest decay was observed for the oldest age cohort.
Figure S5. Estimated learning rates per age cohort from the best models for the non-social learning task. Best models include decaying learning rates. Estimated learning rates are shown as a function of trial number, separately for each of the four age cohorts.
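To illustrate what a decay in learning rates means for belief updating, the sketch below tracks the believed probability that the computer chooses X, starting from the 50% prior. The delta-rule update and the exponential form of the decay are assumptions chosen for illustration, as are all names and parameter values; the actual model specification is given in the main text.

```python
import math

def learning_rate(trial, lr0, decay):
    """Assumed decay form: the learning rate shrinks exponentially over trials."""
    return lr0 * math.exp(-decay * trial)

def update_beliefs(observations, lr0=0.5, decay=0.0, prior=0.5):
    """Return the belief that the opponent chooses X before and after each trial.

    observations: 1 if the opponent chose X on that trial, 0 if it chose Y.
    """
    belief = prior
    trajectory = [belief]
    for trial, obs in enumerate(observations):
        belief += learning_rate(trial, lr0, decay) * (obs - belief)
        trajectory.append(belief)
    return trajectory

# Hypothetical run in an environment where the computer mostly chooses X
observed = [1, 1, 0, 1, 1]
constant = update_beliefs(observed, lr0=0.5, decay=0.0)
decaying = update_beliefs(observed, lr0=0.5, decay=0.5)
print(constant)
print(decaying)
```

With decay set to 0 the model reduces to a constant learning rate, so the two model variants are nested. In a task with two environments, one such belief would be tracked per environment.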

Recovery of computational models
To assess the robustness of the computational models presented in the main text, we performed a recovery analysis. For each of the social games (Trust Game and Coordination Game), and for each of the age cohorts separately, we used the parameters of the best fitting models (which included social preferences and a decaying learning rate) to simulate 100 mock data sets. The distribution of mock participants across cohorts was the same as in our behavioural data set. Subsequently, we fitted each of the four models (including and excluding social preferences and decaying learning rates) to the simulated data and compared their fits. This allowed us to examine the recoverability of our models.
For both economic games, for each of the 100 simulated mock data sets, the model including social preferences and decaying learning rates fitted the data best, showing that our best fitting model is recoverable.
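The logic of such a recovery analysis can be shown in miniature: simulate data from a known generating model, fit every candidate model to each simulated data set, and count how often the generating model wins. The sketch below is a toy stand-in with two Bernoulli models rather than the RL models of the main text, and the use of BIC as the comparison criterion is an assumption; all names are hypothetical.

```python
import math
import random

def log_lik_bernoulli(data, p):
    """Log-likelihood of binary data under a fixed choice probability p."""
    p = min(max(p, 1e-9), 1.0 - 1e-9)  # guard against log(0)
    k = sum(data)
    return k * math.log(p) + (len(data) - k) * math.log(1.0 - p)

def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion; lower values indicate a better fit."""
    return n_params * math.log(n_obs) - 2.0 * log_lik

def fit_fixed(data):
    """Candidate without free parameters: choice probability fixed at 0.5."""
    return bic(log_lik_bernoulli(data, 0.5), 0, len(data))

def fit_free(data):
    """Candidate with one free parameter, estimated by maximum likelihood."""
    p_hat = sum(data) / len(data)
    return bic(log_lik_bernoulli(data, p_hat), 1, len(data))

def simulate_biased(rng, n_trials=200, p=0.8):
    """Generating model: choices drawn with a known probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n_trials)]

def recovery_rate(simulate, candidates, generating, n_sets=100, seed=0):
    """Share of simulated data sets for which the generating model fits best."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sets):
        data = simulate(rng)
        scores = {name: fit(data) for name, fit in candidates.items()}
        if min(scores, key=scores.get) == generating:
            wins += 1
    return wins / n_sets

candidates = {"fixed": fit_fixed, "free": fit_free}
rate = recovery_rate(simulate_biased, candidates, generating="free")
print(rate)
```

A recovery rate close to 1 indicates that the generating model can be told apart from its competitors, which is the sense in which the best-fitting model above is described as recoverable.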
Figure S6. Display of choice behaviour over trials per game, separately for each age cohort.

Participant instructions
Here we include the instructions for the Trust Game and the non-social learning task. Original instructions were in Dutch. The instructions for the Coordination Game are highly similar to the instructions for the Trust Game, and are therefore not included here. Note that figures of the task were included on almost every introduction screen for the behavioural tasks. Here the figures are only shown when necessary for understanding the accompanying text. The instructions also included some control questions; participants could only continue to the next screen when they answered these correctly. Full testing and instruction materials can be obtained from the corresponding author.

Instructions Trust Game
--The game you will play now has 3 parts. Each part has 30 short rounds. In each round you will make 1 choice. On the next screens, the game will be explained.
--In each round you play with another student from another school who also participated in this game.
In each round you will both make 1 choice.
In every round, the other is a new person.
--The game will look like this: You are the purple icon on the left, and you can choose between the 2 purple arrows (A and B). The other is shown on the top of the screen, and can choose between the 2 top arrows (X and Y).
--In each of the four boxes you see dots. Each dot represents 1 point.
The points that you can win with your choice are always purple.
The points that the other player can win, have the other colour.
--In every round you will make a choice between A and B.
The other will make a choice between X and Y. Your choices together determine how many points you and the other win.

points (fill in) --That is correct!
After you see what you and the other have won, the round is over and you will play with a new person.

Important:
The choices of the other players are made previously by other students from another school.
--At the end of the game, the computer will select 5 random rounds.
Each point in those 5 rounds is worth 1 lottery ticket. After all games, we will draw 1 lottery ticket from all lottery tickets at your school.
The winner receives the gift voucher! So the more points you have, the larger your chance to win.
--The other players can also earn lottery tickets with their choices. With the lottery tickets they also have a chance to win a gift voucher at their school. Remember: your choices may affect the number of points you can win, but they may also affect the number of points the others can win.

--Environments
There are 2 environments of other players, each with their own colour. The colour of the other player's icon tells you what environment they are in.
In one environment, players usually choose X (left), and in the other environment players usually choose Y (right).
--In each round you will play with a new player from 1 of the 2 environments. --

Check Question
Suppose that you choose B and the other chooses X.
How many points would you win? (fill in)____
How many points would the other win? (fill in)____
--You have correctly answered the check question!
There will now be 2 practice rounds before you start the real game.
Make your choice by clicking on 1 of your arrows. Then click 'Confirm' at the bottom of the page.
Next, you can see the other's choice, and how many points you and the other have won in this round. Then click 'Continue' to start with the next round.
--{2 practice rounds}
--Check question
Which environment did the other player belong to?
If you don't remember, please click back.
--The game is about to start. Do you have any questions? Please ask the researcher now.
You can click start if you understand the game.

End of part 1
You are finished with part 1 of the game! --

Instructions Non-social learning task Part 3
In Parts 1 and 2 you were playing with other people. In Part 3, you will not be playing with other people. Instead, you will play with computers.
--The game will look like this: You are shown in purple on the left, and you can choose between the two purple arrows (A and B).
The computer can choose between the arrows on the top. Your points are the purple dots.
--As before, you will choose A or B (top or bottom boxes).
We have programmed the computer to choose X or Y (left or right boxes).
--Your choice AND the computer's choice together determine how many points you will win.
Important: the number of points you can win with your choice is different than in the previous games.
You would win ____ points (fill in)
--That is correct!
After you see what you have won, the round is over and you will play with a new computer.
--At the end of the game, 5 rounds will be randomly selected. Again, each point in the selected rounds is worth 1 lottery ticket.
Remember that you can win the gift voucher. The more points you have the larger your chance of winning.

--Environments
There are 2 environments of computers, each with their own colour. The computer's colour tells you what environment the computer is in. In one environment the computers usually choose X (left), and in the other environment the computers usually choose Y (right). In each round you will play with a new computer from 1 of the 2 environments.
--Click Start to start the game.

Prior expectations
At the end of the learning task instructions, right before the start of the task, we asked about the prior expectations of the participants. This was only done for the social games (Trust Game and Coordination Game).

Prior expectations Trust Game
--We are about to start the real game. Before we start, we have a few questions. -- We are curious about what you think of the other players in this game. Suppose that there are 10 other players, how many of these 10 do you think will choose X?

Prior expectations Coordination Game
--We are about to start the real game. Before we start, we have a few questions. -- We are curious about what you think of the other players in this game. Suppose that there are 10 other players, how many of these 10 do you think will choose X?