Sociality predicts orangutan vocal phenotype

In humans, individuals’ social setting determines which and how language is acquired. Social seclusion experiments show that sociality also guides vocal development in songbirds and marmoset monkeys, but absence of similar great ape data has been interpreted as support to saltational notions for language origin, even if such laboratorial protocols are unethical with great apes. Here we characterize the repertoire entropy of orangutan individuals and show that in the wild, different degrees of sociality across populations are associated with different ‘vocal personalities’ in the form of distinct regimes of alarm call variants. In high-density populations, individuals are vocally more original and acoustically unpredictable but new call variants are short lived, whereas individuals in low-density populations are more conformative and acoustically consistent but also exhibit more complex call repertoires. Findings provide non-invasive evidence that sociality predicts vocal phenotype in a wild great ape. They prove false hypotheses that discredit great apes as having hardwired vocal development programmes and non-plastic vocal behaviour. Social settings mould vocal output in hominids besides humans.

Humans acquire language from their individual linguistic communities. Experiments manipulating individuals' social setting, from solitary social isolation to social grouping, have demonstrated that the degree of sociality experienced by songbirds [1][2][3][4][5][6] and marmoset monkeys 7-11 also determines how their vocal repertoire develops and matures. These findings have made these species favoured lab models for the study of (spoken) language evolution 12,13 . However, evolution is a path-dependent process that builds upon a lineage's biology and behaviour, where homology is critical for the reconstruction of ancestral states and insight into their ensuing evolution. Given that songbirds and marmosets are distantly related to our own phylogenetic family, without similar data from our closest living relatives, the (nonhuman) great apes, our understanding of why language transpired in our own clade but none other in 525 million years of vertebrate evolution will probably remain imperfect.
Laboratorial protocols involving solitary social isolation as conducted with songbirds and marmosets are not, however, ethically permissible with great apes. Personhood rights may extend to these species [14][15][16] , and their survival status in the wild is critical [17][18][19][20][21][22] (International Union for Conservation of Nature, Red List of Threatened Species, 2021). In the absence of evidence from social manipulation experiments, great ape vocal phenotype has been presumed siloed from social influence, and their vocal production and repertoire posited as innate, automatic and hardwired [23][24][25] . Enigmatically, these notions fundamentally contradict the role of shared ancestry in biological evolution and lead to notions of language emergence as a non-continuous process 23,24,26,27 . These traditional notions derive in part from historical great ape language projects [28][29][30][31] , which reportedly failed to teach great apes to speak. Paradoxically, however, their study subjects lived in home labs with impoverished (if any) social contact with conspecifics 32,33 . While positive evidence from these individuals' capacities (that is, 'things they can do') can be instrumental for improved heuristics of human evolution [34][35][36][37][38] , negative evidence (that is, 'things they cannot do') is not generalizable 33 . Indeed, several recent human-ape interactional experiments in accredited zoos have now demonstrated that great apes exert fine real-time voluntary control over all the necessary structures required for speech production, including laryngeal control [35][36][37]39 , that their repertoire is composed by vowel-like and consonant-like calls 33,[40][41][42][43][44] and that they can produce these calls with a speech-like rhythm 34,45 . A new framework for the gradual evolution of spoken language in the human clade from an ancestral hominid repertoire and vocal system is, therefore, gaining predominance 42,46-56 . 
The last limitation in this growing body of evidence and the view that great apes are highly desirable models for language evolution research is arguably the fact that most data for vocal (production) learning have thus far derived from captivity [35][36][37]39,52,53,[57][58][59] (cf. 60,61 ). Individuals' social setting in captivity is artificial and relatively monotonous and therefore limits the full expression of animals' natural predispositions and potential phenotypes, making data from the wild paramount. There is extensive evidence for social learning across behaviour domains and for different types of great ape culture in the wild [62][63][64][65] . Although most research effort has focused on material cultures, there is no theoretical reason to believe that social effects would operate in starkly different ways with vocal and communicative behaviour. Great ape vocal research in the wild is inherently difficult and time intensive, but evidence for local traditions in sound communication [66][67][68][69] and call cultures 46,60,61,70 is steadily accumulating across great ape genera, even if great ape behavioural richness is eroding with human impact, and multiple local traditions should be assumed already extinct 71,72 .
To assess the influence of sociality on great ape vocal phenotype and resolve the existing empirical deadlock in the field of language origin and evolution, here we transpose from the artificial setting of the laboratory to the natural social arena of the wild and embark on the largest cross-populational analyses conducted in great ape vocal research to date. We capitalize on 'natural experiments' that have exposed wild orangutans to different degrees of sociality as residents of populations with different orangutan densities. According to the traditional hypothesis that great apes are incapable of vocal (production) learning and poor models for language evolution research [23][24][25] (cf. 28,29 ), individuals should operate as independent agents and their vocal phenotype should take course without influence of social and vocal input. If the traditional hypothesis is correct, natural differences in sociality between wild great apes should show no correlation with the gamut and acoustic range of call variants produced by great ape individuals.

Rationale
Transposing experimentally from songbirds and marmosets in the lab to great apes in the wild requires accounting for three major issues: social proxies, study designs and socio-ecological confounds.
Social proxy: populational density. Orangutans exhibit a fission-fusion social system without permanent social groups (besides long-term mother-infant associations) 73,74 and instead tend to organize in loose female communities with roving adult males 75,76 . This type of social organization typically leads to the exclusion of orangutans from cross-species comparisons because social measures used with other primates simply do not apply 77 . Hence, the degree of sociality here (capturing the probability for social and vocal input) was measured by the number of individuals per unit of area (km²) at each population (that is, orangutan density) 78 . Indeed, higher orangutan densities are associated with higher average percentage of time spent with other independent conspecifics 79 . At the same time, if the opposite were true (that is, higher density without higher social contact), one would predict diminishing home range sizes, which is not observed; higher population densities are associated with more females sharing larger expanses of their home ranges 80 . Overall, this supports the use of density as a surrogate and operable metric of sociality with wild orangutans.
Study design: from longitudinal to cross-sectional. In the lab, studies conducted with songbirds and marmosets have been longitudinal, where infants' vocal development is closely followed through time. In these studies, the effect of social vocal input as a catalyst of vocal changes has been assessed through the measure of a single call's acoustic entropy. This parameter gauges the level of disorder in a sound by analysing a call's energy distribution. Comparing acoustic entropy across time allows for tracking how an individual hones a call's mature/adult/tutored version. But this requires extensive and regular recordings of an individual's vocal behaviour, best achieved with a rapidly developing species in a fixed and predictable environment. Moreover, acoustic entropy is highly sensitive to ambient noise, which can tamper with measures of acoustic energy distribution by adding spurious energy bursts, peak or bands. This requires recordings to be collected in low and/or constant levels of background noise and unchanging acoustic settings.
Conversely, great apes exhibit the slowest development, reproduction rates and generational turnover among the extant primate species, with orangutans' life history being slower than that of humans [81][82][83] . Very few long-term field sites have been able to operate uninterruptedly and follow the development of specific individuals as they age [84][85][86] . Alas, currently, there is no available audio database spanning years of observation at the same location for orangutans or any other great ape. In addition, great ape observation in the wild is not under human control in a similar way as experiments are and must adhere to strict guidelines to assure that individuals remain wild. For example, in orangutan habitat, noise levels and acoustic settings are inconstant, variable and unpredictable, rendering unreliable any analyses based on acoustic entropy. Moreover, to avoid human over-habituation, an orangutan focal individual can be followed for only 5-10 days, after which they cannot be followed for another month, with no expectation of when or whether they will be encountered again. This inherently renders unviable any attempt to systematically and regularly follow individual vocal behaviour and development. The wild thus poses contrasting opportunities and conditions for audio recordings in comparison with captivity; data collection is noise laden, sporadic, opportunistic and cross-sectional.
As such, to surpass the limitations imposed by lab-based methods when applied to the wild, we characterize orangutan vocal phenotypes by measuring individuals' 'repertoire entropy' . Repertoire entropy was calculated across an individual's call repertoire (instead of individuals' single calls as for acoustic entropy) using three entropic parameters: emergence, self-organization and complexity 87,88 . Each of these parameters gauges the distribution probability of novel or conserved call variants within a given set of calls produced by an individual, expressing the variation regime within that repertoire. Accordingly, these parameters do not measure 'raw acoustics' (as in acoustic entropy), but the rate at which calls with similar/distinct acoustics occur. Emergence defines the rate at which new acoustic states (a call variant) appear in a system (an individual's call set/repertoire), with higher values expressing higher rates of original/generative vocal production and vice versa. Self-organization defines the rate at which similar acoustic states appear in an individual's repertoire, with higher values expressing higher rates of conserved/ conformist vocal production and vice versa, where self-organization is inversely proportional to emergence. Complexity defines the balance level between emergence and self-organization in an individual's repertoire; when new acoustic states emerge and are subsequently preserved through repetition (that is, conserved vocal production), over time, that system raises its average number of different states and, hence, its complexity (Supplementary Data 5).
Socio-ecological confounds: ecological. In the lab, different populations can exist and survive in different demographic densities accompanied by virtually no variation in ecological setting. This is because individuals' nutritional and energetic requirements are met by human artificial food provisioning. Conversely, in the wild, high-density populations will probably emerge in ecological habitats inherently more productive. Accordingly, food calls could be potentially affected by or reflect ecological differences between populations instead of differences in sociality between individuals. Therefore, food calls should not be considered for analyses of repertoire entropy. Unlike other great apes 89-92 , orangutans do not produce food calls 93 , but flanged male orangutans can long call upon arrival at a food patch, and so long calls and, conservatively, other call types exclusive to flanged males should also be excluded.
It has also been experimentally demonstrated that forests with different levels of plant productivity (for example, Sumatra vs. Borneo 94 ) and different structural architecture (for example, low mountain rainforest vs. peat swamp) affect sound and information propagation of different orangutan call types in similar fashion 47,95 . Effects due to ecological differences in habitat physical structure can, thus, also be assumed absent or negligible between different areas of orangutan territory.
Socio-ecological confounds: social. In the lab, individuals can be socially 'staged' so they can establish vocal contact with others without social contact. This assures that call variation reflects the degree of vocal input instead of the kind of social interaction. In the wild, vocal input and social contact are, however, often inseparable.

Figure panels: Context; Maximum frequency; Duration; Maximum frequency-based histogram; Duration-based histogram; Normalized entropy. Dashed lines indicate the manual selection from which kiss-squeak maximum frequency (mxf) and duration (dur) were extracted, and how the two acoustic parameters were then processed to calculate their corresponding entropy parameters per individual per context, where E is emergence, S is self-organization and C is complexity. P and p are probabilities, K is a normalizing constant that constrains E, S and C, H is normalized entropy and y represents a call variant (Methods and Supplementary Data 5).
Consequently, it is conceivable that living in high-density populations could lead individuals to engage in different types of social contact and, hence, different types of vocal interaction. Accordingly, social calls could potentially be affected by or reflect differences in social interaction between individuals instead of degree of vocal input. Therefore, social calls exchanged between conspecifics should not be considered for analyses of repertoire entropy.
Orangutans also exhibit call cultures in the wild 60,67,68,93 . These are not instances of geographic variation in the same call type 46 as reported across primates and other mammals 96 . Notably, some orangutan call types are exclusive to one population, whilst other populations exhibit an acoustically distinct 'synonym' call type produced in the same context and function, whereas other populations exhibit no vocal signal for that same context or function. Currently known cultural calls include (mother-infant) social contact calls and calls produced during nest construction 60 . Because these call types are local specific, they should also be excluded from analyses.

Final empirical setup
Accordingly, to prevent ecological and social confounding effects, we analysed orangutans' primary alarm call, the kiss-squeak 93 . This call type is universal across, and prevalent within, every wild population studied thus far. It is one of the most frequently produced calls by wild orangutans, providing relatively ample sampling, notably, towards human observers-a context virtually equal across populations and de-correlated from any orangutan social, ecological or demographic variables. Kiss-squeaks are predator-oriented alarm calls 49,67 and produced comparably by populations exposed to different predator guilds 97 . (Occasionally, they can be given towards other orangutans; thus, these cases should also be excluded from analyses (Methods).) Kiss-squeaks carry over dense forests up to 100 m without losing informational content 47 and thus can be detected, heard and monitored by conspecifics who are within earshot but not interacting socially with the senders. Kiss-squeaks provide, thus, a rare occasion in the wild where vocal input is neither socially motivated nor inextricable from social interaction, further liberating analyses from possible social confounds.
In sum, to study the effects of sociality on the expression of the orangutan vocal phenotypes in the wild, we used a two-island cross-populational cross-sectional study design. We assessed individual vocal phenotypes by calculating repertoire entropy for each individual's kiss-squeak repertoire (N individuals = 76; N calls = 5,290; N populations = 6; N observation hours >6,120; Fig. 1, Supplementary Data 1). Namely, we calculated entropic emergence, self-organization and complexity (Fig. 1, Methods and Supplementary Data 5) based on maximum call frequency (Hz; that is, that of highest dB; N = 69) and duration (s; N = 69) separately for each individual per context (Fig. 1, Methods and Supplementary Data 2, 3 and 5). To quantify the effect of sociality on repertoire entropy, we conducted four linear mixed models, each with one of the entropic measures as a response variable (2 frequency-based + 2 time-based; 2 for emergence/ self-organization + 2 for complexity). Each model included sex (two levels: female, male), age-sex class (five levels: infant, adolescent, adult female with infant, unflanged male, flanged male), species (two levels: Bornean, Sumatran), context (four levels: towards: observers, animals, humans (non-observers), no apparent danger) as control fixed factors and orangutan density as our main factor of interest. Individual ID was included as a random effect to weigh out individuals contributing several data points (Methods and Supplementary Data 4).
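As a reading aid, the model structure described above can be written out as a single equation. This is only a sketch under stated assumptions: the Gaussian error structure, the dummy coding of the categorical factors and all symbol names (y, d, x, u, ε) are illustrative choices that the text does not specify; the fitted models are those reported in Methods and Supplementary Data 4.

```latex
% y_{ik}: one entropic measure (E/S or C, frequency- or time-based)
%         for individual i in context k
% d_{j(i)}: orangutan density of the population j that individual i belongs to
% x^{(.)}: dummy-coded control factors (assumed coding)
% u_i: random intercept for individual ID
\begin{aligned}
y_{ik} ={}& \beta_0 + \beta_{\mathrm{dens}}\, d_{j(i)}
  + \boldsymbol{\beta}_{\mathrm{sex}}^{\top}\mathbf{x}^{(\mathrm{sex})}_{i}
  + \boldsymbol{\beta}_{\mathrm{class}}^{\top}\mathbf{x}^{(\mathrm{class})}_{i}
  + \boldsymbol{\beta}_{\mathrm{sp}}^{\top}\mathbf{x}^{(\mathrm{sp})}_{i}
  + \boldsymbol{\beta}_{\mathrm{ctx}}^{\top}\mathbf{x}^{(\mathrm{ctx})}_{k}
  + u_i + \varepsilon_{ik}, \\
u_i \sim{}& \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{ik} \sim \mathcal{N}(0, \sigma^2).
\end{aligned}
```

Under this reading, the coefficient on density is the quantity of interest, and the random intercept absorbs the non-independence of several data points contributed by the same individual.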

Results and discussion
Orangutan density, a surrogate measure for degree of sociality, had a preponderant effect on individuals' repertoire entropy (Table 1 and Fig. 2), rejecting the traditional hypothesis that great ape vocal phenotype is impervious to social settings. Frequency-based and time-based emergence (that is, 'rate of original calls') and self-organization (that is, 'rate of repetitive calls') were significantly correlated (positively and negatively, respectively) with orangutan density. That is, across six wild populations, individuals living in higher densities were vocally more original and acoustically more unpredictable than individuals living in lower densities, who instead were vocally more repetitive and acoustically more conformative. Additionally, frequency-based and time-based complexity were significantly correlated with orangutan density, with individuals living in low densities exhibiting more complex call repertoires than those living in higher density populations (Table 1 and Fig. 2). It should be noted that these relationships were not an artifact of a smaller number of individuals or calls sampled in the low-density populations or vice versa but instead features of signal variation per individual per context (Methods and Supplementary Data 5).
For frequency-based repertoire entropy, species was the control factor with the strongest effect (Table 1 and Supplementary Data 4). Bornean orangutans were vocally more original and exhibited a more complex repertoire than Sumatran orangutans (Supplementary Data 4), which could reflect overall higher forest productivity in Sumatra (hence, higher orangutan densities) than in Borneo 94 . For time-based repertoire entropy, call context was the control factor with the strongest effect (Table 1); however, there were no substantial differences between contexts (Supplementary Data 4).
Results show that an orangutan's 'vocal personality' (being vocally original vs. vocally conformative) was predicted by that individual's degree of sociality. This effect pertained to alarm calls directed to potential danger and excluded calls produced towards other orangutans. Strictly limiting our analyses to these calls allowed us to curtail possible socio-ecological confounds. Findings show that even in the absence of social interaction or direct vocal exchange, the weight of an individual's social and vocal landscape is sufficient to shape individuals' own vocal output type and variability regime. Individuals in populations with a lower density also exhibited more complex vocal repertoires. This is in line with population models of cumulative cultural evolution that show that the best breeding grounds for the accumulation of new traits through social learning are dispersed populations with intermittent contact 98 . This demographic dynamic is reminiscent of the fission-fusion social organization of wild orangutans and that of ancient humans in the African continent 99 . Indeed, ecological changes towards drier habitats brought about by palaeo-climate change in the African continent 100,101 were unlikely to have sustained densely populated communities in the wake of human evolution 102 . Results agree, thus, with computational models, statistical analyses and phylogenetic reconstructions showing that 'social intelligence' was not an evolutionary driver for human (brain) evolution as much as once believed [103][104][105] .
Some of the vocal dynamics observed contrast with those of captive songbirds and marmoset monkeys: the latter show increased call consistency from young to adult age, whereas we observed the opposite pattern in wild orangutans. Several (non-mutually exclusive) factors may help explain these differences. First, number of tutors probably affects vocal dynamics of novices. For example, marmoset infants attend to one or two tutors during development, but young orangutans seek interaction with multiple adult conspecifics as they gradually become independent 85,106-110 , becoming exposed to larger pools of 'role models' for the acquisition of new behaviours and skills across domains 73,74 . Indeed, when songbirds were experimentally presented with an increased abundance of role models, similar results were obtained 4 . Second, the role of sociality on vocal development in songbirds and marmosets has been observed in transient call types, calls that play a role in supporting vocal development but that are not retained themselves in the mature repertoire 7 . This contrasts with the orangutan calls analysed here; once present in an individual's repertoire, kiss-squeaks are retained in the adult repertoire. Third, life in the wild presents stimuli that are otherwise absent in captivity. For example, by the time a captive infant matures, the range of possible situations that it might encounter in life has been greatly exhausted. This is known to lead to decreasing behavioural variability and potentially to (pathological) stereotypies in captivity. Conversely, the probability of new circumstances in the wild increases once an individual matures and gradually acquires independence, particularly in species with fission-fusion social organization who roam over extensive territories such as orangutans. Wild marmoset studies could help establish a comparison with lab marmoset studies and directly determine wild vs. captivity effects. 
Finally, acoustic entropy was used in lab studies whereas we used repertoire entropy in the wild. It will be important to determine in the future whether or how entropy at these two levels may be interrelated.
To date, all orangutan study sites have experienced some degree of human impact 17,111,112 , particularly in recent decades 20 (International Union for Conservation of Nature, Red List of Threatened Species, 2021). For example, our sample included a Sumatran population that lived in a human-dominated degraded landscape 113 that has now become locally extinct (Sampan Getek). The densities reported here have, therefore, not been shaped over millions of years of evolution. The observed correlation between vocal phenotype and sociality corroborates, therefore, the view that the mechanisms at work here operate at a time scale within individual lifetimes, and thus do not reflect automatic, hardwired development programmes shaped by local adaptation over evolutionary time frames.
Figure caption (fragment): ... (Table 1 and Supplementary Data 4). Graphic representations are based on raw data; differences between density levels are based on model estimates.

Concluding remarks
Our findings show that the degree of sociality experienced by individual orangutans in the wild moulds their vocal personality. Findings converge with evidence for active social learning in wild orangutans 109,110,114 , which suggests that socially sourced information crosses over into the vocal and communicative domain. We confirm that, as with human learners exposed to different linguistic communities, social settings help modulate vocal output dynamics and structure in nonhuman hominids. Future models of language origin and human evolution must account for sociality effects on vocal phenotype expression. Extending at least as far back as the phylogenetic rise of the hominid family, low-density populations provided better breeding grounds for high vocal variant complexity.

Study sites.
This study was conducted across six research stations: Tuanan, Gunung Palung and Sabangau in Borneo (Pongo pygmaeus wurmbii) and Sikundur, Sampan Getek and Suaq Balimbing in Sumatra (P. abelii). This study entailed 2,510 observation hours at Tuanan, 1,520 at Gunung Palung, 311 at Sabangau, 1,132 at Sikundur, 498 at Sampan Getek and 149 at Suaq with a grand total of 6,120 observation hours between 2005 and 2010 and a minimum of five months of uninterrupted orangutan follows and recordings at each site. All sites are laid across the Equator's vicinity and more than 3,000 km away from the Tibetan Plateau. Seasonality is therefore low and without pronounced raining/monsoon vs. dry seasons. No significant effects are hence expected to have arisen due to data having been collected during different overlapping periods/seasons of the year across sites, particularly for calls neither directly nor indirectly related to feeding contexts (for example, food calls and social calls at food patches, respectively). Population estimates were also calculated during these years. Orangutan generation length is typically longer than that of Pan and Gorilla 115 , that is, >25 years; therefore, no significant differences in orangutan density should be expected to have arisen or been biologically possible to have arisen between year of census and year of data collection at each site.
Data collection. All orangutan kiss-squeaks were opportunistically recorded while following subjects, typically at 7 m to 30 m distance from the individuals. Only unaided variants of kiss-squeaks were addressed in the study because other variants are only present in some populations (that is, hand and leaf kiss-squeaks were not considered) 67,68,93 . Calls were recorded at Tuanan using a Marantz Analogue Recorder PMD222 (Marantz Corp.) in combination with a Sennheiser Microphone ME 64 (Sennheiser electronic GmbH & Co. KG) or a Sony Digital Recorder TCD-D100 in combination with a Sony Microphone ECM-M907 (Sony Corp.). In all remaining sites, calls were recorded using a Marantz Analogue Recorder PMD-660 or a ZOOM H4next Handy Recorder (ZOOM Corp.), both connected with a RODE NTG-2 directional microphone (RODE LLC). Audio data were recorded in 16-bit Wave format. No meaningful differences in audio input were expected to result from different professional directional microphones. Audio recordings were collected simultaneously with complete focal behavioural data on the focal animals and other conspecifics when in association. Data collection involved no interaction with or handling of the animals and strictly followed the Indonesian law and research station mandatory guidelines. Orangutan density values were extracted from Husson et al. 78 .
Recordings were transferred to a computer with a sampling rate of 44.1 kHz. Duration (s) and maximum frequency (Hz; that of highest dB) were extracted using Raven interactive sound analysis software (version 1.5, Cornell Lab of Ornithology) using the spectrogram window (window type: Hann; 3 dB filter bandwidth: 124 Hz; grid frequency resolution: 2.69 Hz; grid time resolution: 256 samples). Both parameters were extracted directly from the spectrogram window by manually drawing a selection encompassing the complete call from onset to offset.
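The spectrogram settings above can be cross-checked with a little arithmetic. The sketch below assumes that the grid frequency resolution equals the sampling rate divided by the DFT size, that the DFT size is a power of two, and that the grid time resolution is the hop between successive spectrogram frames. These are common conventions rather than details stated in the text, and the variable names are illustrative.

```python
import math

SAMPLING_RATE_HZ = 44_100          # from the text
GRID_FREQ_RESOLUTION_HZ = 2.69     # from the text
GRID_TIME_RESOLUTION_SAMPLES = 256 # from the text (assumed to be the hop size)

# DFT size implied by the reported frequency grid (assumption: resolution = fs / N).
implied_dft_size = SAMPLING_RATE_HZ / GRID_FREQ_RESOLUTION_HZ

# Assumption: the DFT size is a power of two, so snap to the nearest one.
nearest_power_of_two = 2 ** round(math.log2(implied_dft_size))

# Frequency resolution recomputed from that DFT size, for comparison with 2.69 Hz.
check_resolution = SAMPLING_RATE_HZ / nearest_power_of_two

# Time step between spectrogram frames, in milliseconds.
time_step_ms = 1000 * GRID_TIME_RESOLUTION_SAMPLES / SAMPLING_RATE_HZ

print(nearest_power_of_two, round(check_resolution, 2), round(time_step_ms, 1))
```

Under these assumptions the reported 2.69 Hz grid corresponds to a 16,384-point DFT, and successive frames are a little under 6 ms apart, which is fine-grained relative to the durations measured from manual selections.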
Data analyses: entropy-based parameters and calculations. Loosely speaking, a complex system can be understood as a dynamical system composed of many elements that display functional/spatial/temporal patterns that cannot be derived from its components by themselves 4,5 . Rather, these components and their future are partially determined by their interactions. There are several frameworks to characterize a system's complexity. From these, statistical Shannon-based complexity measures can be employed to determine the complexity of a system using its states' probability distribution. Particularly, the framework proposed by Santamaría-Bonfil and colleagues 88,116 characterizes a system's complexity, either discrete or continuous, as the trade-off between emergence (that is, the appearance of new system states) and self-organization (that is, regular patterns in the form of highly probable system states). Here we limit the formal definition of complexity measures (emergence (E), self-organization (S) and complexity (C)) to its discrete form:

$$E = -K \sum_{i=1}^{b} p_i \log_2 p_i, \qquad S = 1 - E, \qquad C = 4 \cdot E \cdot S,$$

where $p_i = P(X = x_i)$ is the probability of the element i. Moreover, K is a normalizing constant that constrains E, S and C within $0 \le E, S, C \le 1$ and is estimated as

$$K = \frac{1}{\log_2 b},$$

where b corresponds to the system's alphabet size, the number of states a system can exhibit. It is worth noting that C is only maximal (that is, C = 1) when E and S are equal (that is, E = S = 0.5) and becomes zero for equiprobable or Dirac delta distributions. In systems with more than two states, a high C implies that the system concentrates its dynamics into few highly probable states with many less frequent states (for example, a power-law distribution; Fig. 1 and Supplementary Data 5).
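The behaviour of the three measures at their limits can be illustrated with a short sketch. It assumes the normalized Shannon form E = -K Σ p_i log2 p_i with K = 1/log2 b, S = 1 - E and C = 4·E·S, which is consistent with the properties stated above (C = 1 only at E = S = 0.5, and C = 0 for equiprobable and Dirac delta distributions). The function name `esc` and the example distributions are hypothetical and are not taken from the authors' R notebook or MATLAB/Octave functions.

```python
import math

def esc(probabilities):
    """Return (E, S, C) for a discrete probability distribution (assumed forms)."""
    b = len(probabilities)              # alphabet size: number of possible states
    if b < 2:
        return 0.0, 1.0, 0.0            # a single possible state carries no uncertainty
    k = 1.0 / math.log2(b)              # normalizing constant K
    # Emergence: normalized Shannon entropy (terms with p = 0 contribute nothing).
    e = k * sum(-p * math.log2(p) for p in probabilities if p > 0)
    s = 1.0 - e                         # self-organization
    c = 4.0 * e * s                     # complexity: balance between E and S
    return e, s, c

uniform = esc([0.25, 0.25, 0.25, 0.25])  # equiprobable states
dirac = esc([1.0, 0.0, 0.0, 0.0])        # one certain state
skewed = esc([0.89, 0.11])               # two states, unequal probabilities

for name, values in (("uniform", uniform), ("dirac", dirac), ("skewed", skewed)):
    print(name, [round(v, 3) for v in values])
```

The equiprobable case gives maximal emergence and the Dirac case maximal self-organization, and both have zero complexity; the skewed two-state case lands close to E = S = 0.5 and therefore close to maximal complexity.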
We organized orangutans' acoustic measures into sets per population, individual and context. Afterwards, for each set we calculated the respective entropy-based measures for call duration (D) and maximal frequency (F) using openly available tools 4 as follows: for each ith individual from the jth population under the kth ecological context (that is, $x_i^k \in P_j$), we obtained its corresponding E, S and C for duration (D) and maximal frequency (F), such as:

$$E_D(x_i^k),\; S_D(x_i^k),\; C_D(x_i^k) \quad \text{and} \quad E_F(x_i^k),\; S_F(x_i^k),\; C_F(x_i^k).$$

Although frequency and duration measurements are continuous, the number of calls per individual in many cases limited the approximation of the empirical probability distribution of these (by means of a kernel density estimation method), leading to spurious results for continuous complexity measurements. Therefore, first we approximated call duration and maximal frequency probability distribution through a histogram (Fig. 1 and Supplementary Data 5). Next, we employed discrete complexity measures as mentioned earlier.
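The histogram step can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the records, the number of bins, the fixed bin ranges and the function names are all hypothetical, and the entropy formulas are the assumed normalized Shannon forms (E = -K Σ p log2 p, K = 1/log2 b, S = 1 - E, C = 4·E·S).

```python
import math
from collections import defaultdict

N_BINS = 4  # assumed number of histogram bins (the alphabet size b)

def histogram_probabilities(values, lo, hi, n_bins):
    """Approximate a probability distribution with an equal-width histogram."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp the index so that values at or beyond the edges stay in range.
        idx = min(max(int((v - lo) / width), 0), n_bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]

def entropy_measures(probs):
    """Discrete emergence, self-organization and complexity (assumed forms)."""
    k = 1.0 / math.log2(len(probs))
    e = k * sum(-p * math.log2(p) for p in probs if p > 0)
    s = 1.0 - e
    return e, s, 4.0 * e * s

# Hypothetical records: (population, individual, context, duration_s, max_freq_hz)
records = [
    ("pop_1", "ind_a", "observers", 0.10, 900.0),
    ("pop_1", "ind_a", "observers", 0.30, 950.0),
    ("pop_1", "ind_a", "observers", 0.60, 910.0),
    ("pop_1", "ind_a", "observers", 0.90, 940.0),
    ("pop_2", "ind_b", "observers", 0.20, 500.0),
    ("pop_2", "ind_b", "observers", 0.21, 1500.0),
    ("pop_2", "ind_b", "observers", 0.22, 2500.0),
    ("pop_2", "ind_b", "observers", 0.23, 3500.0),
]

# Organize measures into sets per population, individual and context.
sets = defaultdict(lambda: {"D": [], "F": []})
for population, individual, context, duration, max_freq in records:
    sets[(population, individual, context)]["D"].append(duration)
    sets[(population, individual, context)]["F"].append(max_freq)

# One (E, S, C) triple per set for duration (D) and for maximal frequency (F).
results = {}
for key, calls in sets.items():
    p_d = histogram_probabilities(calls["D"], 0.0, 1.0, N_BINS)
    p_f = histogram_probabilities(calls["F"], 0.0, 4000.0, N_BINS)
    results[key] = {"D": entropy_measures(p_d), "F": entropy_measures(p_f)}

for key, measures in results.items():
    print(key, measures)
```

In this toy data, `ind_a` varies call duration widely but keeps maximal frequency within one bin, whereas `ind_b` does the opposite, so the two parameters yield opposite emergence and self-organization values for each individual. The bin count and ranges matter: they fix the alphabet size b and therefore the normalization.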
As can be observed in the R code notebook (Supplementary Data 5), orangutan individuals' calls generally range from low to very high complexity. Individuals with only one record per context are regarded as completely self-organized (E = 0, S = 1 and C = 0), as seen for a group of individuals (for example, Ronaldo, Freddy and Tina). These cases were excluded from subsequent analyses (reduction of N = 106 to N = 89), together with entropy measures based on three or fewer calls, as these were expected to provide insufficient coverage of the possible acoustic states for an individual's call variation within a given context (N = 89 to N = 77). The entropy values calculated for the context 'towards other orangutans' were also removed, so that no calls directly exchanged between conspecifics entered our analyses and social confounds were avoided, as explained in the Introduction (N = 77 to N = 69).
We should note that the function of these repertoire entropy parameters is to directly quantify the degree/rate of novel or conserved states within a system/call collection. This is not equivalent to detecting vocal convergence/divergence between individuals. For example, two individuals may exhibit between them distinct or similar sets of calls (acoustically divergent or convergent, respectively) and show the same level of self-organization in either case, namely, when calls of similar/different acoustics within individuals occur at similar rates. Vocal convergence/divergence (and acoustic entropy) is tied to raw acoustics of single calls, whereas repertoire entropy is tied to variation regimes of call collections.
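The distinction can be made concrete with a toy example. The sketch below uses hypothetical call-variant labels (`A1`, `B1` and so on) and a hypothetical helper, `self_organization`; it assumes S = 1 − E with E normalized by log2 of a shared alphabet size. Two individuals use entirely different variants, yet because the variants occur at the same rates within each individual, their self-organization is identical.

```python
import math
from collections import Counter


def self_organization(call_labels, alphabet_size):
    """S = 1 - E, with E the Shannon entropy normalized by log2(alphabet_size)."""
    counts = Counter(call_labels)
    n = len(call_labels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return 1.0 - entropy / math.log2(alphabet_size)


# Hypothetical, acoustically divergent repertoires: no variant is shared.
individual_a = ["A1"] * 6 + ["A2"] * 2
individual_b = ["B1"] * 6 + ["B2"] * 2

alphabet = set(individual_a) | set(individual_b)  # four possible states
s_a = self_organization(individual_a, len(alphabet))
s_b = self_organization(individual_b, len(alphabet))

print(set(individual_a) & set(individual_b))  # empty set: divergent call sets
print(s_a == s_b)                             # same level of self-organization
```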
For 'layperson' examples of how these entropic measures can be applied across systems, please see Supplementary Data 5 for flip-a-coin examples and see ref. 87 for examples pertaining to household electric spending, solar flares and bike-sharing services. To consult the open-access 'white paper' dedicated to the comprehensive description and technical explanation of these measures, please see ref. 88. MATLAB/Octave functions are provided therein for the application of these measures across natural and artificial systems (in addition to the R code notebook provided in Supplementary Data 5 as applied to our datasets).

Data analyses: linear mixed-effect models.
After the entropy measures were estimated for each set, we studied the effect of sex (two levels: female, male), age-sex class (five levels: infant, adolescent, adult female with infant, unflanged male, flanged male), island (two levels: Bornean, Sumatran) and context (five levels: towards observers, animals, humans (non-observers), no apparent danger) on the three entropic measures for maximum frequency and duration (thus, six models in total), including them as fixed control factors. Orangutan density was included as our main fixed factor of interest in all models. We included individual identity as a random effect to control for repeated measures. We implemented our linear mixed models (LMMs) (test model terms: Satterthwaite; model type: III sum of squares) using open-source JASP 117 (v. 0.14.1). Results were plotted using R 118 and the 'ggplot2' 119 and 'gridExtra' 120 packages.
Population was not included as a random effect because our design did not include repeated measures at the population level, because the complete resident population at each site was sampled (instead of partial pooling per population) and because the variable is categorical with few levels (that is, six), in which case the variable should be included as a fixed effect rather than a random one. However, population fully co-varies with orangutan density, the main variable of interest: orangutan density does not vary within population. Including population (as a random or fixed effect) would therefore not help to control for sampling bias, and its inclusion would spuriously reduce statistical power. (Force-inserting the variable as a fixed effect in our model leads JASP to produce error warnings and abort the operation.) It should be noted that under general statistical heuristics, there is a difference between clear hypothesis testing (X affects Y, hypothesized in advance), as we do here, and purely exploratory approaches. Hypothesis testing should seek to avoid model complexification, and this is also the reason why no interactions were included in our model; our working hypothesis did not rely on interactions between fixed factors for verification. Measured and well-motivated addition of supplementary variables and interactions could be a helpful alternative to understand the phenomena under observation, but only in purely exploratory approaches.
Maximum frequency and duration constituted orthogonal, non-correlated variables (Spearman's rho = −0.017, P = 0.221); however, because they were extracted from the same call event, they should be treated as non-independent. Given that entropic emergence, self-organization and complexity were in turn all derived from both maximum frequency and duration, the results of our linear mixed models needed to be adjusted for false discovery rate (FDR). To this end, we applied the Hochberg correction procedure 121, 'arguably still the most widely used and cited method for controlling the FDR in practice' 122. To compute adjusted P values using this correction, we used 'p.adjust {stats}' in R.
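For readers who want to see what such an adjustment does, a minimal Python sketch follows. It assumes that the correction meant is the Benjamini-Hochberg step-up procedure for FDR control (`method = "BH"` in R's `p.adjust`). Note that `p.adjust` also offers a distinct `"hochberg"` method, and the text does not state which argument was passed. The function name `benjamini_hochberg` and the raw P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order.

    Each P value is scaled by m / rank (rank in ascending order) and the
    result is made monotone by taking a running minimum from the largest
    P value downwards, capped at 1.
    """
    m = len(p_values)
    # Indices of the P values from largest to smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values, for illustration only.
raw = [0.01, 0.04, 0.03, 0.20, 0.50, 0.002]
adj = benjamini_hochberg(raw)
print([round(p, 3) for p in adj])
```

Adjusted values are never smaller than the raw ones and preserve their ordering, so a term that is significant after adjustment was also significant before it.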
Reporting Summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data and Code Availability
All data and code needed to evaluate the conclusions in the paper are present in the paper and/or the electronic supplementary materials (Supplementary Data 1-5). Additional data may be requested from the authors.
Corresponding author(s): Adriano R. Lameira. Last updated by author(s): Jan 27, 2022.


Software and code
Data collection: N.a.

Data analysis: Raven interactive sound analysis software (version 1.5, Cornell Lab of Ornithology, Ithaca, New York); R 4.0 and R Studio.
