The Male Warrior Hypothesis: Testosterone-related Cooperation and Aggression in the Context of Intergroup Conflict

The Male Warrior Hypothesis (MWH) proposes that men’s psychology has been shaped by inter-group competition to acquire and protect reproductive resources. In this context, sex-specific selective pressures would have favored cooperation with the members of one’s group in combination with hostility towards outsiders. We investigate the role of developmental testosterone, as measured indirectly through static markers of prenatal testosterone (2D:4D digit ratio) and pubertal testosterone (body musculature and facial masculinity), on both cooperation and aggressive behavior in the context of intergroup conflict among men. Supporting the MWH, our results show that the intergroup conflict scenario promotes cooperation among group members and aggression toward outgroup members. Regarding the hormonal underpinnings of this phenomenon, we find that body musculature is positively associated with aggression and with cooperation, although for cooperation only when context (inter-group competition) is taken into account. Finally, we did not find evidence that the formidability of the group affected individual rates of aggression or cooperation, controlling for individual characteristics.

• First, using a larger sample and a new population, we expect to replicate the previous results that indicate that intra-group cooperation and inter-group aggression are heightened in the context of intergroup conflict.
• Second, the use of aggression among human males is related to T-dependent physical traits. We expect this relationship to be positive in the contexts of both dyadic (one-to-one standard PSAP) and inter-group conflict. Accordingly, we expect a positive relationship between developmental T and aggressive behavior in the PSAP task under the control condition and in the context of intergroup competition.
• Third, similar to aggression, the use of cooperation is related to physical T-dependent traits, but in the opposite direction according to the context. More concretely, we expect a positive relationship between developmental T levels and cooperation in the intergroup conflict context, but a negative one in the control context, as previously reported 19 .
• Finally, we predict that individuals in more formidable groups, measured as the sum of group muscle mass or as the muscle mass of the most muscular individual in the group (the potential leader), show higher levels of aggression and cooperation than individuals in less formidable groups, but only in the intergroup competition context.

Methods
Participants. Over two years, 246 young men (mean = 22.21 years, standard deviation = 3.20) from public universities in the 5th Region of Chile were recruited. Individuals were usually recruited as a group of 6 members who therefore knew each other. Four individuals were excluded because they did not complete all the games. We chose young adults because intrasexual competition and aggression are more intense in that period of life 58 . At the end of the experimental protocol, participants received $15,000 Chilean pesos each (around $23 USD) for participating. They received an additional payment of up to another $15,000 pesos according to their performance in the games. Thus, participants could receive a maximum of $30,000 pesos, and in fact, 90% of the participants received that amount. We decided to give a significant amount of money ($30,000 pesos represents 10% of the minimum monthly wage in Chile) to ensure interest and reliable participation.
Ethics committee authorization and ensuring anonymity. The Institutional Bioethics Committee of the Universidad de Playa Ancha approved the research, including protocols and data treatment. All methods were performed in accordance with the relevant guidelines and regulations. Participants were asked to read and sign an informed consent form that detailed the procedure and the confidentiality steps. We used a standard coding process to preserve the anonymity of the participants 18,59 . All the participants signed the informed consent prior to their participation in the study.
Group formation, context manipulation, and the data collection procedure. Each group of 6 participants was randomly assigned to one of two treatments: an experimental condition in which the intergroup competition scenario was presented, and a control condition in which no mention was made of an intergroup threat. There were 20 groups in each condition. More details of the conditions of the games are provided below. The games were conducted in the Laboratorio de Comportamiento Animal y Humano (www.labcah.cl) of the University of Playa Ancha, Chile. This laboratory has six experimental cabins with computers connected in a local network. The cabins are isolated from visual and audio stimuli. This ensures a high level of reliability in the performance of games, prevents talking among participants and favors their concentration. The data were collected in two sessions for each group, with one week between sessions. On the first day, we applied a sociodemographic questionnaire (i.e., sexual orientation and age), conducted the Public Good Game, and took anthropometric measurements. The next week, participants performed the Point Subtraction Aggression Paradigm and received their payment. Groups were usually composed of individuals who were familiar with one another. However, in a few groups, some individuals did not know each other because they were friends of friends. We statistically controlled for this heterogeneity of group composition in terms of friendship.
Prenatal testosterone was inferred from right-hand finger measurements, based on earlier studies indicating that prenatal T is most reliably estimated by this method 29,60 . We followed the protocol proposed by Manning 32 , and replicated by Muñoz-Reyes et al. 61 . We measured the 2nd and 4th fingers from the basal crease of the finger to the tip. Each finger was measured twice, and we used the mean value of the two measurements. We used a high-precision digital caliper (±0.01 cm). The resulting variability (SD = 0.001) was similar to that obtained for this index in previous studies (SD = 0.03 in 60,61 ), which indicates a good level of precision.
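The digit-ratio computation described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the function name `digit_ratio` and the example lengths are hypothetical, and the sketch assumes the index is the mean length of the 2nd finger divided by the mean length of the 4th finger.

```python
from statistics import mean


def digit_ratio(second_finger_cm, fourth_finger_cm):
    """Return the 2D:4D ratio from repeated length measurements.

    Each argument is a sequence of repeated measurements (in cm) of one
    finger, taken from the basal crease to the fingertip. The repeated
    measurements are averaged before the ratio is taken (an assumption
    about the order of operations; the text only says means were used).
    """
    if not second_finger_cm or not fourth_finger_cm:
        raise ValueError("each finger needs at least one measurement")
    return mean(second_finger_cm) / mean(fourth_finger_cm)


# Hypothetical right-hand measurements, two per finger.
example_ratio = digit_ratio([7.00, 7.02], [7.30, 7.32])
```

A ratio below 1 means the 4th finger is longer than the 2nd.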
Indirect measure of pubertal testosterone. Facial masculinity. Facial photographs in frontal view were taken of all participants with a digital SLR camera (Nikon D7000) under standardized conditions, in terms of light and head orientation, focal length (3 m), shutter speed (1/60 s) and aperture (f/5.6). Any facial adornments were removed, and participants were asked to look straight into the camera with a neutral expression.
Facial masculinity was based on the facial width-to-height ratio (fWHR). Facial height was measured as the vertical distance between the highest point of the upper lip and the nasion, and facial width as the horizontal distance between the left and right zygion (i.e., bizygomatic width, the maximum horizontal distance between the right and left facial boundaries). Landmarks were located manually with TPS software. Finally, we compared our manual measurements with those obtained by the software FACE++, which locates and returns high-precision facial landmarks. We automated the use of this software through an algorithm in MatLab created by the eighth author, which is connected to the Application Programming Interface of FACE++. There was a high degree of correlation between our manual fWHR measurements and those obtained from the MatLab algorithm (r = 0.82). The results were the same with either of the two fWHR measurements. Given the high degree of correlation between these methods, we preferred to use the manual measurements due to their proven utility in previous studies 62 .
Body Muscularity. We followed the protocol used by Muñoz-Reyes et al. 55 . We first measured the participants' height in centimeters, barefoot, and with a manual stadiometer (SECA ® 203). We then used the InBody ® 370 body composition analyzer to estimate muscularity in kilograms. This device uses a tetrapolar 8-point tactile electrode to measure body composition by direct segmental multifrequency bioelectrical impedance analysis (DSM-BIA). This technique divides the body into five cylindrical parts before estimating impedance separately for each part, i.e., the four limbs and the trunk. The InBody ® 370 applies three frequencies (5, 50, and 250 kHz) to measure impedance in the five segments. This methodology has been validated to assess body composition 63,64 . Bosy-Westphal et al. 63 found that 97% of the variance in total SMM measured by magnetic resonance imaging was explained by SMM measured by DSM-BIA, whereas Ling et al. 64 compared measurements of total lean mass of men measured using DSM-BIA with those obtained from dual energy X-ray absorptiometry, and found an intraclass correlation coefficient of 0.96. In addition, we collected data on the participants' body mass index (BMI).
Behavioral measurements. For the baseline treatment, we used two experimental paradigms: the Point Subtraction Aggression Paradigm (PSAP) and the Public Good Game (PGG). Whereas the PSAP has been used to elicit aggressive inclinations at the individual level in the context of dyadic one-against-one interaction, the PGG has been used to elicit cooperative dispositions in the context of a larger group social dilemma. These control conditions produce measurements of both cooperative and aggressive dispositions at the interpersonal level. The experimental conditions, the intergroup PSAP (IPSAP) and the intergroup PGG (IPGG), allow us to explore how intergroup conflict modulates intergroup competition and intra-group cooperation, respectively. In all games where interaction with men other than those in the group was necessary (i.e., dyadic and group conditions), participants were informed that they played with real people, although they were playing against a fictitious opponent (i.e., the software of the games).
Measurement of aggression. The Point Subtraction Aggression Paradigm (PSAP). First applied by Cherek in the 1980s, the PSAP is a highly reliable tool to estimate aggression, especially in men 65 . It consists of a computer game in which participants play against a fictitious opponent. Individuals are told that the objective of the game is to score the maximum points, which are exchanged for real money at the end of the game. The participant's score is shown on a central monitor. Participants have three behavioral options that cannot be taken simultaneously:
a) Gaining points: Participants gain 1 point by pressing button A 100 times. One point is equal to $1,000 Chilean pesos.
b) Aggression: Participants are informed that they can steal points from the other participant, but without gaining these points. Therefore, by pressing button B 10 times, they harm their adversary by subtracting one point, but without a concomitant increase in their own point total (i.e., stealing decreases the other player's score without increasing one's own). In addition, participants are told that their rivals get the points that are taken from them. To the extent that the only effect of stealing is to harm your opponent, stealing is consistent with the definition of aggression by Baron.
c) Protection: Participants are told that their rivals can steal their points. Participants can avoid losing points by pressing button C 10 times, which protects them from points being subtracted in possible attacks during a fixed period of time.
We conducted a single 10-minute round. Participants under the control condition played the classic dyadic one-against-one version of the PSAP, while the participants under the experimental condition were told that they were part of a group competing with another group in a laboratory of another university located in the capital of the country. They were informed that each of them would be paired with only one member of the competitor group, but that the winner would be the group that gained more points. The winning group would receive a bonus equal to the points obtained by the losing group. This bonus would be split evenly between the members of the winning group. The losing group would only receive their individual points. Because the competitor group was fictitious, we always informed the participants that they had won the match, and gave them a bonus equal to 50% of the points they had obtained themselves. To achieve more ecological validity and take into account the relevance of aggression in intergroup competition, but also for intragroup status, we followed the strategy used by Geniole et al. 65 . In this version, men are provoked intensely (i.e., participants lose 20 points per session). Aggression was calculated as the number of times button B was pressed as a percentage of the total number of times all the buttons were pressed. It is important to note that in our intergroup condition, conflict involves an outgroup threat with real potential consequences in terms of monetary payoffs, which the members of the group can collect by outcompeting the fictitious outgroup. We refer to this version of the PSAP as the intergroup PSAP (IPSAP).
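The aggression index and the press-to-point rules of the PSAP can be illustrated with a short sketch. The function names and the press counts in the example are hypothetical, and counting only completed blocks of presses toward points is an assumption, since the text does not say how partial blocks were handled.

```python
PRESSES_PER_POINT_GAINED = 100     # button A: 100 presses earn 1 point
PRESSES_PER_POINT_SUBTRACTED = 10  # button B: 10 presses subtract 1 point


def aggression_rate(presses_a, presses_b, presses_c):
    """Button-B presses as a percentage of all button presses."""
    total = presses_a + presses_b + presses_c
    if total == 0:
        raise ValueError("no button presses recorded")
    return 100.0 * presses_b / total


def points_gained(presses_a):
    """Points earned with button A (assumes only completed blocks count)."""
    return presses_a // PRESSES_PER_POINT_GAINED


def points_subtracted(presses_b):
    """Points taken from the opponent with button B (same assumption)."""
    return presses_b // PRESSES_PER_POINT_SUBTRACTED


# Hypothetical participant: 900 presses on A, 50 on B, 50 on C.
rate = aggression_rate(900, 50, 50)
```

Because the index is a share of all presses, it is comparable across participants who press at different overall speeds.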
Measurement of Cooperation. The public good game. As in any social dilemma, cooperating (any positive contribution) is a dominated strategy (i.e., a strategy that a selfish agent would never implement), but the absence of cooperation leads to an inefficient social outcome. Accordingly, the contributions of individuals can be used to assess their cooperative tendencies 67 . In the present research, we applied the protocol used by Van Vugt et al. 14 and replicated by Stirrat & Perrett 19 to measure changes in cooperation with the presence of intergroup conflict in the experimental condition. The public good game was played on computers using z-Tree software 68 .
Participants started the game with $5,000 Chilean pesos. They could decide how much to invest for the benefit of the group. They were told that they would receive a bonus of $11,000 pesos when total investment by the group exceeded $18,000 pesos, regardless of their individual contribution. However, no bonuses would be given if the group failed to contribute more than $18,000, and participants only gained the amount of money they decided not to share. Under experimental conditions, a group competed with another group for the bonus. As stated before, in the experimental procedure, the rival group was fictitious, although participants were informed that they played against a real group. The group of participants won the PGG if they contributed more than $18,000. Following previous research 14,19 , before playing the public good game, participants were provided with a complete description of the game and played a practice game. As part of a wider study, participants played two rounds of the game. The second round of the game was designed to study changes in contributions after winning or losing the first round. This was not included in this study as we were interested in the effect of intergroup conflict on one-shot cooperation. The difference between the design of Van Vugt et al. and ours is that, as in the IPSAP experimental condition, the outgroup threat involves real monetary consequences. In contrast, in the design of Van Vugt et al., the introduction of intergroup conflict relied on priming inter-group competition. Specifically, participants in the PGG from Southampton were told that the study was running simultaneously at 10 different universities in England.
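The payoff rule of this threshold public good game can be sketched as below. The function name `pgg_payoffs` is hypothetical. The strict comparison follows the wording that total investment must exceed $18,000 pesos; whether a total of exactly $18,000 earned the bonus in the actual z-Tree implementation is not stated, so that boundary is an assumption.

```python
ENDOWMENT = 5_000   # pesos each participant starts with
THRESHOLD = 18_000  # group investment that must be exceeded
BONUS = 11_000      # pesos paid to every member if the group succeeds


def pgg_payoffs(contributions):
    """Return each member's payoff for one round of the game.

    A member keeps whatever was not invested and, if the group total
    exceeds the threshold, also receives the bonus regardless of the
    size of the individual contribution.
    """
    for amount in contributions:
        if not 0 <= amount <= ENDOWMENT:
            raise ValueError("contribution must be between 0 and the endowment")
    bonus = BONUS if sum(contributions) > THRESHOLD else 0  # assumed strict
    return [ENDOWMENT - amount + bonus for amount in contributions]


# Hypothetical group of six: five members invest 4,000 and one invests nothing.
payoffs = pgg_payoffs([4_000] * 5 + [0])
```

In this hypothetical round the group total is 20,000, so the five contributors end with 12,000 each and the non-contributor with 16,000, which illustrates why contributing is a dominated strategy for a selfish agent.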
To facilitate the interpretation of our results, let us discuss how the presence of inter-group conflict modifies the incentive structure that subjects face under the IPGG and the IPSAP experimental conditions. The only thing members of the group can do in the IPGG to outcompete other groups, and thus capture the winning prize, is increase intra-group cooperation. Alternatively, under the IPSAP scenario, besides continuing to gain points or defend against potential attacks (intra-group cooperation), stealing points from their opponents (inter-group aggression) also increases the likelihood of outcompeting their opponents.
Data analyses. We conducted two t-tests with independent samples to test our first prediction by comparing mean rates of aggression and contribution according to whether individuals belonged to control or experimental groups. We also employed non-parametric Mann-Whitney U tests because rates of aggression and contribution in the PGG are non-normally distributed variables.
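The two group comparisons can be sketched with the standard library alone. This is an illustration, not the SPSS procedure used in the study. It assumes the unequal-variance (Welch) form of the t-test, which the fractional degrees of freedom reported in the Results (df = 214.03) suggest but the text does not state. The function names are hypothetical, and p-values are left out because they require distribution functions outside the standard library.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(x), len(y)
    q1, q2 = variance(x) / n1, variance(y) / n2
    t = (mean(x) - mean(y)) / sqrt(q1 + q2)
    df = (q1 + q2) ** 2 / (q1 ** 2 / (n1 - 1) + q2 ** 2 / (n2 - 1))
    return t, df


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def mann_whitney_u(x, y):
    """Smaller of the two Mann-Whitney U statistics (a reporting convention)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    return min(u1, n1 * n2 - u1)
```

With made-up samples `[1, 2, 3]` and `[4, 5, 6]`, `welch_t` gives a negative t with 4 degrees of freedom and `mann_whitney_u` gives 0, because every value in the first sample lies below every value in the second.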
To test our second prediction, we fitted a general linear mixed model (GLMM) considering the following predictor variables: context (i.e., experimental or control), age, 2D:4D ratio, the facial width-to-height ratio, muscle mass, and body mass index. The rate of aggression in the PSAP was our outcome variable.
The third prediction was tested by a GLMM fitted to consider the following predictor variables: context (i.e., experimental or control), age, 2D:4D ratio, the facial width-to-height ratio, muscle mass, and body mass index. The outcome variable was the contribution. We expected an interaction between context and traits that denotes developmental T levels, and we took into account the interaction terms involving these variables.
To test our fourth prediction, we took the fitted model obtained for predictions 2 and 3 and tested whether the interaction of group formidability and context is significant. This assessed the potential effect of group formidability on individual expression of aggression and cooperation in a context of outgroup threat.
We used GLMMs to take into account the hierarchical nature of our data, in which we have individuals in groups and variables at the individual (e.g., muscle mass) and group levels (e.g., group formidability). We employed a step-up strategy to fit our models. In this procedure, all the predictor variables and the predicted interactions were compared individually with the null model. The variable that showed the best fit was introduced in the model. Next, we introduced the remaining variables one-by-one and compared their fit with the previous model (i.e., the reduced model). This procedure continued until no variables improved the reduced model. To compare nested models, we used the Akaike information criterion and the maximum likelihood estimation 69 . We considered SMM and BMI together on the one hand and the facial width-to-height ratio and BMI on the other when fitting the models to control for the effect of BMI on SMM and the facial width-to-height ratio. The model related to the aggressive response in the PSAP showed non-normal residual distributions. Because the rate of aggression was strongly right-skewed, we transformed it by calculating its square root. This transformation solved the problem of the non-normality of the residuals. However, because the fitted models were the same for the original and transformed variables, we show the results with the original variable. We employed the statistical package HLM 7 to perform the GLMMs and IBM SPSS 21 for the t-tests. The level of significance was set at alpha = 0.05.
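The step-up strategy can be sketched as a forward selection loop driven by the Akaike information criterion. This is an illustration of the general technique, not the HLM 7 workflow used in the study. The names `step_up`, `aic` and `toy_aic`, the predictor labels and all numbers in the toy scoring function are hypothetical. In practice `aic_of` would refit the mixed model by maximum likelihood for each candidate set of predictors.

```python
def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_parameters - 2 * log_likelihood


def step_up(candidates, aic_of):
    """Forward selection: add the best remaining term while AIC improves.

    `aic_of` is any callable that takes a list of predictor names and
    returns the AIC of the model fitted with exactly those predictors.
    An empty list stands for the null model.
    """
    selected = []
    best = aic_of(selected)
    remaining = list(candidates)
    while remaining:
        scores = {name: aic_of(selected + [name]) for name in remaining}
        winner = min(scores, key=scores.get)
        if scores[winner] >= best:
            break  # no remaining term improves the reduced model
        selected.append(winner)
        remaining.remove(winner)
        best = scores[winner]
    return selected, best


# Toy stand-in for model fitting: each term lowers the deviance by a fixed,
# made-up amount and costs the usual AIC penalty of 2 per parameter.
TOY_GAIN = {"context": 10, "muscle_mass": 4, "bmi": 1, "age": 0.5}


def toy_aic(terms):
    return 100 - sum(TOY_GAIN[t] for t in terms) + 2 * len(terms)


chosen, final_aic = step_up(TOY_GAIN, toy_aic)
```

With these made-up gains the loop keeps `context` and `muscle_mass` and stops, because adding `bmi` or `age` would raise the AIC.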

Results
Differences in aggression and cooperation according to the competitive context. Table 1 shows the descriptive statistics of the variables used in the study according to the context. Table 2 shows Spearman's correlation coefficients between these variables in the control and experimental contexts.
The results of the t-tests showed differences in aggression between the two contexts in the PSAP (t = −5.722, df = 214.03, p < 0.001). Individuals in the intergroup competitive context showed higher rates of aggression [...] test point to the same pattern (Mann-Whitney U test: U = 6153.50, n1 = 119, n2 = 123, p = 0.030). To further explore the effect of context on cooperation, we analyzed cooperation according to whether participants invested at least the minimum amount needed to reach the threshold under equal contributions. We found that the effect of context on cooperation was non-linear. Whereas the percentage of individuals that decided to invest $3,000 pesos or more did not differ between conditions (χ² = 0.608, df = 1, p = 0.435), the individuals that invested $3,000 pesos or more in the control condition contributed less (mean = 3719.72, SD = 673.70) than those in the experimental condition (mean = 4002.51, SD = 762.63) (t = −2.742, df = 194, p = 0.007), but there were no differences among individuals that invested less than $3,000 pesos (control condition: mean = 1852.00, SD = 832.85; experimental condition: mean = 1970.95, SD = 628.66) (t = −0.538, df = 44, p = 0.593).
We did not find differences between the contexts for any of the predictor variables (see Table 1). Table 3 shows the fitted model considering prenatal and pubertal markers of T levels according to the context. In addition, we considered age and BMI as control covariables. We found that muscle mass was a positive predictor of aggression in both contexts (B = 0.238, t = 2.294, p = 0.023; see Fig. 2). However, neither the facial width-to-height ratio nor the 2D:4D index was a significant predictor of aggression in the PSAP, regardless of the context. We found a main effect of the context (B = −4.299, t = −4.282, p < 0.001). Individuals in the intergroup conflict context were 2.08 times more aggressive than individuals in the control condition when muscle mass and BMI were evaluated at their means (control context: mean = 3.971, SD = 0.717; intergroup context: mean = 8.270, SD = 0.702). These results further support the previous finding of the effect of context on aggressive behavior. Table 4 shows the fitted model considering prenatal and pubertal markers of T levels and their predicted interactions according to context. We also considered age and BMI as control covariables. We found an interaction between context and muscle mass.

Discussion
In this paper, we tested several predictions derived from the male warrior hypothesis 8,14 . First, we replicated previous results 14,15,19,71 about the importance of the intergroup conflict scenario in promoting cooperation within group members and aggression toward outgroup members. We then tested specific predictions about the hormonal underpinnings of male cooperation and aggression during intergroup conflict, concretely the role of an indirect measure of developmental T levels in both behaviors and in a context of intergroup conflict versus a control context without an outgroup threat. In this case, we found only partial support for our predictions since only muscle mass, an indirect marker of pubertal T levels, seems to be associated with aggression and cooperation in the predicted direction when the context is taken into account. Finally, we did not find evidence that the formidability of the group affected individual rates of aggression or cooperation, controlling for individual characteristics.
The male warrior hypothesis is founded on the importance of intergroup conflict for the reproductive success of individuals, especially men. This framework argues that men have physical and psychological traits selected in the context of intergroup competition. Several investigations have shown that men, in fact, tend to show ingroup favoritism and outgroup hostility, a phenomenon known as "parochial altruism" [72][73][74] . In this study, we replicated this finding, showing that on average individuals were more cooperative in a public good game, that is, they contributed more to the common pool when they competed against another group in order to first reach a threshold than when they played the game in order to reach the same threshold but without the threat of an outgroup. Further, although most individuals contributed 3,000 Chilean pesos or more, which was the minimum amount to reach the threshold considering an equal contribution, we found that among individuals that contributed 3,000 pesos or more, those in the experimental context behaved more altruistically, that is, they contributed more than those in the control condition. The contributions of individuals that did not invest at least 3,000 pesos were not different between the two conditions. In other words, individuals in the experimental context decided to invest far more than the minimum to reach the threshold under an equal contribution. If one assumes that there is an implicit norm under this scenario, namely to contribute 3,000 pesos, the difference between both treatments is driven by the supererogatory behavior of some agents whose extra contribution could be understood as a status-seeking strategy. Moreover, individuals behaved more aggressively during the PSAP when they were collectively competing against another group (experimental condition) than when they were competing individually against an individual from an outgroup.
Thus, aggression was heightened by intergroup conflict even if this aggression was in some sense spiteful, as it was costly for the aggressor and the receiver. In fact, rates of aggression correlated negatively with profits in our study, and the average benefits of individuals in the intergroup conflict context were lower as a consequence. Our results support previous findings about the importance of intergroup competition in cooperation with members of the group and aggression against the members of a competing group.
We were also interested in the hormonal underpinnings of this phenomenon. Testosterone is an androgenic hormone that, among other functions, has been proposed to be a key factor in calibrating cooperative and aggressive responses in different contexts 9 . More concretely, the indirect measures of developmental T levels are expected to be associated with aggressive responses in general and intergroup conflict scenarios 37,38,41,42 . In this study, we tested the indirect effect of developmental T levels with rates of aggression in a context of intergroup conflict and in a control context. First, we found some support for the claim that an indirect measure of developmental T levels is important in determining levels of aggression. Concretely, we found that body muscularity is a positive predictor of rates of aggression in the intergroup conflict scenario, as well as in the control scenario. However, we did not find any effect of the fWHR or 2D:4D. The fWHR has been related to aggression, both self-reported and as measured by the PSAP 30 , although the relationship between the fWHR and aggression may be moderated by social status 75 . However, a recent study that evaluated the fWHR and bicep circumference found that bicep circumference was a significant predictor of aggression in the PSAP, while the fWHR was not 76 . Our results are in accordance with those of the latter study in suggesting that body muscularity is the key factor related to aggression and that the fWHR is a correlate of potential physical threat 76 . Evidence of the relationship between 2D:4D and aggression comes mainly from self-reports of aggressive behavior rather than behavioral measurements obtained in a laboratory paradigm, and their effects are small 37 . In our study, 2D:4D was not a significant predictor of aggressive response in the PSAP in either condition. 
2D:4D is probably a reliable indicator of psychological predisposition to aggression, but the real physical power of individuals limits this predisposition under realistic conditions. Using a war game, McIntyre et al. 38 found more strategic than direct use of aggression. Another explanation of this null result can be found in the proposal of Manning et al. 29 of an indirect effect of 2D:4D on competitive/aggressive behavior, in which 2D:4D predicts spikes in circulating T, which in turn promote aggressive behavior. Future studies should include measuring circulating T and other forms of aggression to rule out or demonstrate a role of 2D:4D in aggressive behavior in the scenario of intergroup conflict.
Regarding cooperative behavior, we found that muscle mass was an important variable in determining individual levels of contribution in the Public Good Game. The effect was moderated by context, that is, more muscular men behaved more cooperatively in the intergroup conflict condition and less cooperatively in the control condition. This supports the prediction that the indirect measure of developmental T levels enhances ingroup cooperation when facing an outgroup threat. These results are similar to those of Stirrat & Perrett 19 , who showed that the fWHR is related to contributing more or less in a Public Good Game, depending on the context (between groups versus within groups). However, we tested both fWHR and muscle mass and only found a significant effect of muscle mass on cooperation. This result adds further evidence about the importance of the intergroup context in moderating the relationship between the indirect measure of pubertal T levels and cooperation in men and suggests that muscle mass plays a more prominent role in cooperation during intergroup conflict. Less muscular men behaved more cooperatively under our control condition. A possible explanation is that our indirect measure of pubertal T levels, muscle mass, enhances anti-social behavior when there is no outgroup threat. The latter probably occurs because physical power is a reliable cue for fighting ability 18,77,78 that serves to subdue ingroup rivals and acquire social status and benefits through the use of non-cooperative displays (for instance, the use of anger in 53 ). Given that most individuals contributed at least 3,000 pesos in the Public Good Game, that is, most individuals cooperated in the game, we suggest the control condition can be understood as a scenario based on the balance of costs and benefits of obtaining social status through prestige.
In the context without outgroup threat, individuals whose traits indicate lower pubertal T levels have the opportunity to gain status through prestige in a context in which competitive traits are less important 24 . However, when cooperation is triggered by intergroup competition, individuals whose traits indicate higher pubertal T levels can be expected to cooperate more 14,19 . We speculate that this is a reliable way to maintain previously acquired ingroup status, and to maintain group structure. Regarding prenatal T, we did not find any relationship between 2D:4D and cooperation. Although previous studies have reported inconsistent results about the relationship between 2D:4D and cooperation [44][45][46]79,80 , it is plausible that the indirect measure of prenatal T has a more indirect role in the expression of cooperation in the intergroup conflict scenario, as with aggression. In a study with female subjects 81 , administering testosterone increased cooperation only among participants with a lower 2D:4D ratio.
We did not find evidence that individuals adjust their behavior according to the formidability of the group. This is a key prediction of the male warrior hypothesis 8 , which, to our knowledge had not been tested until now. There are several possible explanations for the absence of any indication of the influence of group composition. Our expectation that individuals would adjust their behavior in the same way according to the composition of the group may have been too simplistic. It is possible that the effect of group formidability only manifests itself with a few individuals who act as leaders, while weaker individuals rely on the strength of the leaders. Another explanation is that because members of one group are not able to assess the formidability of the outgroup, they calibrate their behavior based on their day-to-day experience about the formidability of the group, which may vary among individuals in the same group and among groups. In any case, previous studies suggest that individuals can assess the formidability of other groups. Therefore, future studies should focus on determining if this ability translates into calibrating individual behavior according to group composition compared with the outgroup.
Our study has several limitations that have to be considered in appraising the scope of our results. First, we have not measured the degree of friendship among the participants. However, our statistical design allowed us to control for variability in aggression and cooperation between groups. In any case, we are beginning to test this variable for a new project, and although it is clear that an interaction scenario affects the behavior of participants, it would be interesting to test the possible effect of the previous history of the relationship on strategic behavior in the context of both aggression and cooperation. In addition, we have not considered the inclusion of psychological variables that could moderate the expression of the tested behavior, such as self-perceived social status, general aggression, and social value orientation. Finally, although we have demonstrated that outgroup aggression and ingroup cooperation are exacerbated among men engaged in intergroup contests, we cannot deny the possibility this effect is also present among women. In the future, we expect to study this issue among women.
To conclude, our results from applying an experimental design under controlled conditions support one of the main predictions of the male warrior hypothesis, that aggression and cooperation are heightened in groups of men in the context of intergroup conflict. Our analysis of the effects of developmental testosterone on aggression and cooperation further supports the notion that body muscularity is an important trait that influences the intensity of aggressive responses under provocation, as measured in the PSAP. Notably, the relationship between body muscularity and aggression is not dependent on the context (intergroup conflict versus control). In our study, the context of intergroup aggression increased the rates of aggression independent of testosterone. In other words, men seem to increase aggression regardless of their body muscularity. Our results indicate that context influences and increases cooperative behavior in men. Muscularity affects cooperativeness, more muscular men being more cooperative than less muscular counterparts in the intergroup scenario, but with the reverse effect in the control situation (only with the intragroup scenario).
Future studies are needed that include circulating T and analysis of this effect in interaction with anthropometric indicators of developmental T levels. More complex experimental designs are needed to include the assessment of the rival group's formidability. The mechanisms (if they exist) are more likely to be noted if a visual comparison is made before the competition.

Data availability
The data set generated and employed in this study is available from the corresponding author upon request.