## Introduction

Interpreting gregarious behaviour in terms of cooperative strategies that maximize individual fitness, Hamilton1 developed the theory of kin selection. This theory uses the concept of inclusive fitness to explain the evolution of social organization and cooperation where individuals indirectly enhance their fitness through positive effects on the reproduction of relatives. However, individuals can also derive benefits from associations with unrelated individuals where cooperation is conditional on the behaviour of the companion (i.e., reciprocity2,3), always yields the highest benefit (i.e., mutualism4,5) or results in shared fitness advantages from helping increase group size (i.e., group augmentation6,7) which, for example, could lead to more effective group defense. In fluid aggregations, such as herds or flocks, benefits to the individual alone (e.g., reducing personal predation risk at the expense of other group members) may drive the tendency to associate with conspecifics8,9. Thus, resolving the genetic relationships among group members is central to understanding the advantages of group living, the emergence of cooperative behaviour, and the evolution of social organization. The recent re-emergence of arguments for group selection theory, where kinship plays a minor role in social evolution10,11, and the debate these arguments have elicited12, as well as the growing evidence for culture in non-primate species13,14,15 (defined as the acquisition or inheritance of knowledge or behaviours from conspecifics through social learning13), has further heightened the interest in the role of kinship in the characteristics, dynamics and function of groups in social species.

Many aspects of beluga whale (Delphinapterus leucas) behavioural ecology, including their highly gregarious behaviour16 and sophisticated vocal repertoires17,18, associated with a diverse suite of interactive behaviours19,20, suggest that this arctic cetacean lives in complex societies. Belugas exhibit a wide range of grouping patterns from small groups of 2–10 individuals to large herds of 2,000 or more, from apparently single sex and age-class pods to mixed-age and sex groupings, and from brief associations to multi-year affiliations16,21,22,23. This variation suggests a fission–fusion society where group composition and size are context-specific, but it may also reflect a more rigid multi-level society comprised of stable social units that regularly coalesce and separate. The role kinship plays in these groupings is largely unknown.

It has been postulated that beluga whale group structure centres around females with their calves of different ages16,21 and is similar to the group structure in killer whales (Orcinus orca) and some other odontocete whale species21,24 that primarily comprise closely related individuals from the same maternal lineage25,26,27,28,29. Group structure is quite different in other odontocetes, such as the bottlenose dolphin (Tursiops spp.), where grouping patterns vary from all-male alliances and female bands to mixed groups of varying size and stability30,31. While matrilineal affiliations exist, bottlenose dolphin groups are not strictly matrilineal and the extent of kinship within groups varies dramatically32,33,34.

Genetic studies have revealed significant geographic partitioning of mtDNA lineages in beluga whales35,36,37,38 and found that relatives sometimes travel together within large migrating herds or occur in close temporal proximity throughout the migratory cycle more frequently than expected by chance39,40. These findings, in concert with the discovery of closely related individuals returning to the same summering location years and even decades apart, are compelling evidence of natal philopatry to migration destinations where the strong mother–calf bond may facilitate the cultural learning of migration routes40. However, it must be noted that herds also contain large numbers of unrelated individuals39,40 and the potential preferential association of matrilines (or even just close kin) beyond mother–calf pairs within these large seasonal aggregations has not been investigated. Furthermore, there is almost no information on the possible role of kinship in smaller groupings. And yet, the model of a stable matrilineal group as the cornerstone of beluga society is often used as a social framework to interpret other aspects of beluga whale behaviour and ecology24,41,42.

In this study we used field observations, mtDNA profiling, and multi-locus genotyping of beluga whales to address fundamental questions about beluga group structure, and patterns of kinship and behaviour that provide new insights into the evolution and ecology of social structure in this Arctic whale. The study was conducted at ten locations, in different habitats, across the species’ range, spanning from small, resident groups (Yakutat Bay) and populations (Cook Inlet) in subarctic Alaska to larger, migratory populations in the Alaskan (Kasegaluk Lagoon, Kotzebue Sound, Norton Sound), Canadian (Cunningham Inlet, Mackenzie Delta, Husky Lakes) and Russian (Gulf of Anadyr) Arctic to a small, insular population in the Norwegian High Arctic (Svalbard) (Fig. 1).

We investigated whether there are basic types of groupings and association patterns in beluga whales that are consistently observable within and across populations and habitats, and if so, to what degree they are kin-based. The following seven hypotheses were tested: (H1) beluga whales form a limited number of distinct group types observable across varied locations and habitats; (H2) certain behaviours are more prevalent in particular group types; (H3) beluga whale groups are predominantly kin-based, comprising high proportions of close relatives; (H4) beluga whale groups are matrilineal, comprised of close maternal relatives; (H5) females are more related than males within groupings; (H6) larger herds are composed of multiple distinct matrilineal groupings; and (H7) adult-only groups are exclusively comprised of males. We compare our findings with current knowledge on social structure of other odontocete whales. We interpret our results in the context of existing theories on the evolution of social organization and discuss their implications for beluga whale management in a changing Arctic including how social disruption might influence culture and population recovery.

## Results

### Grouping patterns and behaviour

Seven distinct group types were identified in beluga whales, two of which fell under the definition (see “Methods”) of a herd (i.e., > 50 animals) and five that were defined as social groups (i.e., ≤ 50 animals; Table 1). These group types were observed repeatedly within and across multiple locations and habitats (Table 1). The five types of social group were: (A) adult–calf dyads, (B) groups consisting only of adults with calves, (C) groups of juveniles only, (D) groups of adults only, and (E) mixed-age groups. The two types of herd were (F) adult-only herds and (G) mixed-age herds. We further distinguished two other grouping types from the mixed-age herd type: (H) daily aggregations, and (I) multi-day aggregations. These latter groupings were essentially mixed-age herds that were not under continuous observation over the period of tissue sampling (i.e., a single day and several days, respectively) and therefore simultaneous association among all sampled whales in the grouping could not be affirmed. A possible sixth social group type was observed—an adult with two calves of differing ages—tentatively termed a triad, though this possible group type was observed only once (group type A1, Table 1). At one location, Kasegaluk Lagoon (Fig. 1), whales that were in a herd were slowly driven in a set direction by a number of small boats for several hours prior to observation and sampling which may have influenced herd composition. However, we assume that these mixed-age herds were predominantly natural associations.
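
The group-type scheme described above can be sketched as a simple decision rule. This is only an illustration: the function name `classify_group`, the three age classes (adults, juveniles, calves) and the order in which the rules are checked are assumptions, not the field protocol used in the study. Under these assumed rules, the single adult-with-two-calves triad (A1) would fall into type B.

```python
HERD_MIN_SIZE = 51  # herds were defined as > 50 animals


def classify_group(adults, juveniles, calves):
    """Return a group-type letter (A-G) from counts per age class.

    The three age classes and the rule order are illustrative assumptions.
    """
    total = adults + juveniles + calves
    if total < 2:
        raise ValueError("a grouping needs at least two whales")
    adults_only = juveniles == 0 and calves == 0
    if total >= HERD_MIN_SIZE:
        # F: adult-only herd, G: mixed-age herd
        return "F" if adults_only else "G"
    if adults == 1 and calves == 1 and juveniles == 0:
        return "A"  # adult-calf dyad
    if adults_only:
        return "D"  # adults only
    if adults == 0 and calves == 0:
        return "C"  # juveniles only
    if juveniles == 0 and adults > 0 and calves > 0:
        return "B"  # adults with calves
    return "E"  # any other mix of age classes
```

For example, `classify_group(3, 2, 1)` returns `"E"` and `classify_group(60, 0, 0)` returns `"F"`.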

The most commonly observed behavioural category was Travel, followed by Social, Milling and Other. The diversity of behaviours observed within groups was likely influenced by the amount of time the whales were under observation; in some cases, groups of whales were under observation for only a few minutes before biopsy sampling commenced, so were typically only observed travelling. Similarly, whale behaviour in some instances was clearly influenced by human activity; whales at Kasegaluk Lagoon were driven into the lagoon by hunters. Undisturbed social groups were typically observed performing behaviours that fell under a single category, primarily Travel or Social, with some recorded in the Other behavioural category that likely reflected molting or natal care (Fig. 2a, SI Appendix 1). Herds and large aggregations tended to conduct a wider array of behaviours, hence scoring the highest D values (Fig. 2b). We found limited evidence of statistically significant differences in behaviour among groupings at the primary behavioural category level. Group type A exhibited significantly higher frequencies of Other behaviour than most other group types (Fisher’s exact test p = 0.13–0.004); this behaviour involved very close associations between the adult and calf, suggestive of natal care (SI Appendix 1). Aggressive behaviour to non-group members, suggestive of dominance behaviour, was observed only in group type D, while interactions suggestive of play behaviour were observed only in group types C and E (Table 1).

### Molecular genetic analysis

PCR-based sex identification revealed that group Type E and Type G typically contained both males and females. By contrast, Type D and Type F were almost exclusively comprised of males (Table 1). The one exception was an adult-only herd observed in Kotzebue Sound, Alaska, where the sex identification assay revealed that 3 of 41 whales were female. For all Type A groups, the adults were determined to be females.

Beluga whale groups were frequently composed of more than one maternal lineage (Figs. 3, 4). Other than adult–calf dyads (see below), groups of beluga whales even with as few as 2 individuals older than dependent calves were found to contain whales that had different mtDNA lineages (Fig. 3). The number of distinct mtDNA lineages found within social groups ranged from 1 to 4 (Fig. 3), while the number observed within herds and aggregations ranged from 3 to 12 (Fig. 4).

All but one of the adult–calf dyads were determined to be mother–calf dyads; the single remaining dyad did not match at one locus, and ml-relate conservatively estimated the relationship to be a half sibship (Fig. 3). However, when a genotyping error rate of 0.05 was used, a parent–offspring relationship was found to be more likely. Apart from mother–calf dyads, beluga whale groupings of almost all types contained a mixture of closely related and either distantly related or unrelated individuals (Figs. 3, 4). Large herds and aggregations were particularly dominated by distantly related or unrelated (U) pairings (Fig. 4b) even though they also contained first-order (PO) and second-order (FS and HS) relationships.

Due to statistical power considerations, the demrelate analysis was conducted only on migrating herds and seasonal aggregations where the sample size exceeded 14 individuals. The analysis revealed that for the relatedness estimator $$M_{xy}$$ the observed frequencies of FS and HS in these larger groupings were significantly higher than expected frequencies of sibships in a randomly generated population with the same allele frequencies and with the same sample size (Table 2). By contrast, the observed frequencies of sibships using the $$r_{xy}$$ estimator were often lower than random expectations (Table 2). However, further tests on sample sets with artificially inflated frequencies of siblings revealed that the $$r_{xy}$$ estimator consistently overestimated expected frequencies (Supplementary Table S1 online).

While the herds and aggregations typically comprised multiple mtDNA lineages, there was no clear evidence that each lineage represented an extended matrilineal family. For most groups, average relatedness among group members (both within and between lineages) was low, ranging from $$\overline{r}$$ = −0.03 to 0.04. While $$\overline{r}$$ within mtDNA lineages tended to be higher, this was not statistically significant in most herds analyzed (p > 0.05; Fig. 5). The one exception was the ice-entrapped herd in Husky Lakes, adjacent to the Canadian Beaufort Sea ($$\overline{r}_{\text{within}}$$ = 0.15 vs. $$\overline{r}_{\text{between}}$$ = −0.03, p < 0.01), in which several mother–calf dyads were sampled.
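
A minimal sketch of the within-lineage versus between-lineage comparison is given below. This section does not say which test produced the reported p-values, so the label-permutation test here is a stand-in chosen for illustration, not the study's method. The function names, the `frozenset` keying of pairwise values and the default of 999 permutations are all assumptions.

```python
import random
from itertools import combinations
from statistics import mean


def within_between(r, lineage):
    """Mean pairwise relatedness within and between mtDNA lineages.

    r maps frozenset({id_a, id_b}) to a relatedness value;
    lineage maps each individual id to its haplotype label.
    """
    within, between = [], []
    for a, b in combinations(sorted(lineage), 2):
        value = r[frozenset((a, b))]
        (within if lineage[a] == lineage[b] else between).append(value)
    return mean(within), mean(between)


def permutation_p(r, lineage, n_perm=999, seed=0):
    """One-sided p-value for (within - between) by shuffling lineage labels."""
    rng = random.Random(seed)
    obs_within, obs_between = within_between(r, lineage)
    observed = obs_within - obs_between
    ids = sorted(lineage)
    labels = [lineage[i] for i in ids]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        w, b = within_between(r, dict(zip(ids, labels)))
        if w - b >= observed:
            hits += 1
    # Add one to numerator and denominator so p is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Shuffling labels keeps the number of individuals per lineage fixed, so every permutation has the same number of within-lineage and between-lineage pairs as the observed data.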

Comparing patterns of pairwise r within and between the different types of whale groups revealed that apart from adult–calf dyads, beluga whale social groups, herds, and aggregations had mean and median pairwise r values close to zero (Fig. 6a). Furthermore, we found little evidence of significant differences in $$\overline{r}$$ among grouping types, apart from all comparisons involving adult–calf dyads ($$\overline{r}_{QG}$$ = 0.468, p = 0.0001) and some comparisons for all-adult herds ($$\overline{r}_{QG}$$ = 0.022, p = 0.0001 to 0.770). Behaviour did not appear to be related to median relatedness within groups (t-test p > 0.05; Fig. 6b), although groups observed conducting Social behaviour had significantly lower mean relatedness ($$\overline{r}_{QG}$$ = −0.082) than those involved in Travel ($$\overline{r}_{QG}$$ = 0.065; p = 0.03). This was driven primarily by the adult–calf dyads, all of which were recorded as travelling (Fig. 6b).

Beluga whale networks based on genetic relatedness were characterized by long paths that connected through a few central individuals (Fig. 7). Both the automatic and manual thresholds of most networks revealed that few social groups or herds formed a “connected network”, which is a network that consists of a single component where all nodes (individuals) could reach every other node via some path (Fig. 7). Most individuals within a beluga whale social group or herd were directly linked to just one or two close relatives ($$\overline{k}$$ = 1.40–2.85) who in turn, were linked to a few other whales, thus forming long, interconnected paths (Fig. 7). Regularly, a smaller number of animals had links to more individuals (k = 3–10) while some individuals were not connected (k = 0) to other whales in the group or herd at all. The betweenness centrality value for individual nodes varied greatly within most networks (e.g., bc = 0–279), further indicating that beluga whale networks were not fully connected and generally comprised long paths that interconnected through a few individuals. The clustering coefficient of individual nodes was generally low ($$\overline{C}$$ = 0.06–0.5), indicating that in most cases where an individual was closely related to two or more other individuals, those other whales were not closely related to each other. There were exceptions to this, as can be seen from the highly reticulated elements of some of the networks (e.g., Fig. 7a, c). These highly connected network ‘neighbourhoods’ likely indicated multiple familial relatives.
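
The node-level properties discussed above (degree k, clustering coefficient C, and whether a network is connected) can be illustrated with a small, dependency-free sketch. The relatedness threshold is a hypothetical parameter standing in for the automatic and manual thresholds mentioned in the text, and the function names are assumptions. Betweenness centrality is left out to keep the example short.

```python
from itertools import combinations


def build_network(pairwise_r, threshold):
    """Link two individuals when their pairwise relatedness meets the threshold.

    pairwise_r maps (id_a, id_b) tuples to relatedness values.
    """
    adj = {}
    for (a, b), r in pairwise_r.items():
        adj.setdefault(a, set())
        adj.setdefault(b, set())
        if r >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def degree(adj, node):
    """Number of direct links (k) for one individual."""
    return len(adj[node])


def clustering(adj, node):
    """Fraction of a node's neighbour pairs that are themselves linked (C)."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    linked = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return linked / (k * (k - 1) / 2)


def is_connected(adj):
    """True when every node can reach every other node via some path."""
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        for nxt in adj[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(adj)
```

A whale linked to two relatives that are not linked to each other has k = 2 and C = 0, which is the pattern the low clustering coefficients describe.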

There was limited evidence of structuring within the genetic networks based on maternal family lines. While some networks contained elements where a number of linked individuals had the same mtDNA haplotype, in most cases links occurred among individuals who possessed different haplotypes (Fig. 7). This was even the case in the highly connected familial neighbourhoods mentioned above, indicating paternal rather than maternal relatedness. The network properties of betweenness centrality (bc), degree (k), and clustering coefficients (C) were rarely found to differ among haplotypes within networks (t-test and one-way ANOVA, p = 0.01–0.72). There was also no apparent difference between the properties of males and females in the networks (e.g., Fig. 8b). For example, bc did not differ significantly between the sexes (p = 0.14–0.8) and neither did C (p = 0.16–0.98) or k (p = 0.16–0.91).

There were a number of herds (Kasegaluk Lagoon and Husky Lakes) where age estimates were available for all individuals. Age categories did not significantly influence the location or other network properties of individuals within the network (ANOVA, p = 0.08–0.76). Juvenile and subadult whales were as likely to be at the center or periphery of the network (bc), have as many links to other individuals (k), and form reticulated clusters with multiple whales that were also related to each other (C), as adult animals were (e.g., Fig. 8c). In the one instance where multiple calves were sampled, the calves had central positions within the network ($$\overline{k}$$ = 3.33) and relatively high clustering ($$\overline{C}$$ = 0.61) and betweenness ($$\overline{bc}$$ = 21.33) compared to the entire network, though not significantly so (p = 0.17–0.64, Fig. 7e).

Interestingly, the networks of all-male herds were quite similar to those of mixed herds (Fig. 7c). The male herds were generally comprised of only adult animals, with one exhibiting greater connectivity among individual whales than all other herds sampled in that population (Fig. 7c).

Network analysis of social groups was not very informative. However, in one location (Cunningham Inlet, Canada), where several groups within a large summer aggregation were biopsied over a period of 5 days, individuals observed associating in a distinct group typically did not cluster together within the network (Fig. 9a). Conversely, closely related whales that were not observed associating were often sampled nearby on the same day or on subsequent days (Fig. 9b).

In some locations whales that were observed associating were caught, satellite tagged, and subsequently tracked, providing an opportunity to assess seasonal movement and association patterns in relation to genetic relatedness. Five adult male belugas tagged from the same herd in Kasegaluk Lagoon had similar seasonal movements while the tags transmitted (n = 13–104 days), and some appeared to travel as a group for periods of up to 29 days43. Interestingly, none of these whales were closely related. In Svalbard, three adult male whales travelling together were tagged at the same time and they moved together throughout the subsequent months (for as long as the tags transmitted; n = 52–120 days)44. Again, they were not closely related. Finally, three young adult males that were tagged at another location in Svalbard, this time over three successive days, spent most of the time together, occasionally splitting up only to come back together, over the timeframe that the tags transmitted (n = 10–63 days) (Lydersen personal communication); these individuals were also not closely related.

## Discussion

Beluga whales formed a variety of group types that were consistently observed in multiple populations across the species range and certain behaviours were associated with grouping type (Table 3). Similar grouping patterns have been observed by others16,21,22,23,45,46,47 and a diverse range of behaviours has also been described for beluga whales in the wild. A number of these behaviours have been linked to specific vocalizations20,48,49, have been found to be influenced by environmental conditions and spatiotemporal variables21,50,51,52, or have been found more commonly associated with specific group types21,41,49. However, until now there have been limited formal analyses of the relationship between behaviours, group type, group dynamics, and kinship.

Our genetic analysis revealed several unexpected results. Beluga whale groupings (beyond mother–calf dyads) were not usually organized around close maternal relatives. The smaller social groups, as well as the larger herds, routinely comprised multiple matrilines. Even where group members shared the same mtDNA lineage, microsatellite analysis often revealed that they were not closely related (but see Husky Lakes), and many genealogical links among group members involved paternal rather than maternal relatives (Figs. 3, 4, 7). These results differ from earlier predictions that belugas have a matrilineal social system of closely associating female relatives21,24. They also differ from the association behaviour of the larger toothed whales that informed those predictions. In ‘resident’ killer whales, for example, both males and females form groups with close maternal kin where they remain for their entire lives25,28,53. Both long-finned (Globicephala melas) and short-finned (G. macrorhynchus) pilot whale societies are structured along similar lines26,54, while female sperm whales (Physeter macrocephalus) form stable multi-generational matrilineal social units27,29.

In several cases males held central positions within reticulated ‘neighbourhoods’ of networks, indicating that they were closely related to multiple group members, some of whom were likely their offspring or grandoffspring (Fig. 8). The occurrence of paternal relatives within the same grouping was even more evident in the all-male herds (Fig. 7c, f). These findings indicate that male belugas may exhibit high fidelity to a herd for much of their lives, often associating with adult offspring of both sexes. Furthermore, males may be highly philopatric to their natal herd and thus associate with parents and grandparents.

The brief periods of observation of most beluga groups in this study combined with the much longer periods tracking satellite tagged whales43,44 provided important insights into the stability and dynamics of grouping patterns. Close relatives did not always associate in a group, but the fact that they could be in another group close by was supported by field observations where individually recognized whales were observed moving between groups, and even group types, over a few days, and in some cases a few hours (O’Corry-Crowe, field notes). By contrast, unrelated whales can spend long periods of time and cover considerable distances together, and sometimes split up only to come back together.

The relationships we found between group type, behaviour, dynamics and kinship indicate that the driving forces behind social structure in beluga whales are complex. Grouping patterns may depend on the social context (i.e., who is present), as has been proposed for bottlenose dolphin fission–fusion societies30, but also on life history and the behavioural/ecological context including: breeding, migration, feeding, vigilance and natal care. To what degree these groupings are cooperative or selfish is not clear. More remains to be learned about the longevity, stability and kin composition of beluga groupings before clear hypotheses about inclusive-fitness benefits versus non-kin-based advantages of group membership can be tested. Recently, female kinship has been identified as central to social complexity in cetacean species with kin selection as a primary evolutionary driver of cooperation, life history and culture55. However, there are already indications that more than one evolutionary mechanism may be involved in beluga whales. For example, the small groups comprising two or three large males may be similar to male alliances in bottlenose dolphins30,56 or coalitions in lions (Panthera leo)57,58 and chimpanzees (Pan troglodytes)59, where group members cooperate primarily to secure reproductive benefit. If these beluga affiliations are cooperative in nature, the finding that group members tended to be unrelated (Fig. 3) indicates that direct fitness benefits in terms of improved reproductive success (and possibly survival) to group members may be garnered via reciprocity2,3, mutualism4 and/or manipulation5.

Several genetic studies have revealed a strong tendency for beluga whales to remain in their natal subpopulation or population35,36,37,38,39,40. When viewed with the current study’s findings this provides more insight into the social, ecological and demographic scales at which beluga whale societies may be operating. We propose that beluga whales, across a wide variety of habitats and among both migratory and resident populations, form communities of individuals of all ages and both sexes that regularly number in the hundreds and possibly the thousands. Beluga whales may form a wide variety of social groupings within these communities, dependent on immediate social and ecological contexts, that may include seasonal sexual segregation. At larger spatiotemporal scales there is strong philopatry or fidelity by both sexes to these mixed-age and mixed-sex communities.

We have shown that beluga whale societies pose several challenges to emerging explanations for the evolution of sociality, culture and unique life history traits in toothed whales. The stable matrilineal societies of killer, sperm, and pilot whales seem to fit neatly with the theory of kin selection, anchored in inclusive fitness benefits gained by associating and cooperating with close relatives55. Inclusive fitness also underpins evolutionary explanations for a rare phenomenon in nature: menopause, which has only been recorded as prevalent in a few vertebrate species that includes humans and four toothed whale species (including beluga whales)60,61,62. New research on killer whales, for example, found that assistance provided by post-reproductive grandmothers improved the survival of their grandoffspring63. Some of these matrilineal whales may form matriarchal societies where older females have substantial influence over kin as seen in other, long-lived matrilineal species including African elephants (Loxodonta africana)61,64,65,66. Stable, multi-generational matrilineal whale societies also seem ideal environments for the emergence of cultures because of the inclusive fitness benefits of transmitting behavioural traditions and ecological knowledge to close kin13, which in turn may influence gene evolution and even speciation67,68.

On the face of it, beluga whales seemed to fit this model; they form multi-generational groupings16,21,23, females have long post-reproductive lifespans46,62, and the prolonged period of maternal care seems the likely conduit for social learning and the emergence of migratory culture40. Our study did find that close kin, including close maternal kin, regularly interact and associate. However, it also revealed that beluga whales frequently associate and interact with more distantly related and unrelated individuals. Inclusive fitness benefits alone seem insufficient explanations for the evolution of group living in beluga whales. The frequency with which adult female belugas associate, and presumably cooperate, with non-kin also complicates studies of menopause where contributing to the fitness of kin (along with a long lifespan) is considered the basis for its evolution61,63,69,70. Our findings indicate that evolutionary explanations for group living and cooperation in beluga whales must expand beyond strict inclusive fitness arguments to include other evolutionary mechanisms.

Belugas likely form multi-scale societies from mother–calf dyads to entire communities. The longevity and stability of a grouping, and the adaptive advantages to the individual of being a group member, likely differ at these different scales. While membership in social groups can be highly dynamic, both males and females appear to be highly faithful to their community. These behaviours in concert with a long lifespan (≥ 70 years)46 create an environment where frequent interactions may occur, and long-term relationships may develop, among both kin and non-kin of differing ages and both sexes. In such a social setting inclusive fitness benefits, such as the care of non-descendant young and group leadership, may maintain cooperation among close kin (kin selection). Similarly, non-kin could benefit directly from the vigilance of others or indirectly from the sharing of ecological knowledge (mutualism). For example, beluga whales have been observed to rapidly respond en masse to the presence of a predator, including dispersing from a killer whale attack site for several days71 and actively encircling and encroaching on a polar bear until it swam ashore72. Furthermore, frequent interactions among non-kin over a long life may provide ample opportunities for receiving delayed benefits from cooperative exchanges (reciprocity). The prevalence of reciprocity in animal societies, however, is widely debated5,73 and would be challenging to study in wild belugas. Unlike the matrilineal whales, perhaps, the regular occurrence of adult males as well as females in mixed migrating herds and summer aggregations, when taken with evidence of both male and female philopatry35,36,37,38,39,40, suggests that elders of both sexes may be important repositories of social and ecological knowledge in belugas.
In these communities social learning may occur among non-kin as well as kin, facilitating the emergence of cultures (e.g., the development and perpetuation of migratory circuits and the use of traditional feeding areas) that are beneficial to all members of the community. Similar arguments may apply to the evolution of social structure in other cetacean species that form fission–fusion societies and/or are non-matrilineal and where groups comprise both kin and non-kin including bottlenose dolphins30, northern bottlenose whales (Hyperoodon ampullatus)74 and possibly Baird’s beaked whale (Berardius bairdii)75.

From these perspectives, beluga communities have similarities to human societies where social networks, support structures, cooperation and cultures involve interactions between kin and non-kin76,77. An analysis of human societies may also be instructive in understanding menopause in beluga whales. Others have argued that menopause can only evolve when inclusive fitness benefits outweigh the costs of halting reproduction early61. However, unlike killer and pilot whales, but like some human societies76, beluga whales do not solely (or even primarily) interact and associate with close kin. Nevertheless, it may be that their highly developed vocal communication19,42 enables beluga whales to remain in regular acoustic contact with close relatives even when not associating together, and that over the course of a long life, individual belugas preferentially assist close kin when they do encounter them, and in the case of older post-reproductive females preferentially assist maternal kin.

Matrilineal societies pose unique challenges for species management where social and cultural disruption due to the loss of adult females may have far reaching consequences to populations beyond the immediate impacts of lowered productivity78,79. In beluga whale societies this may also hold true, but, in addition, mortality of older males as well as older females may also increase the risk of losing important ecological and social knowledge. Furthermore, cultural conservatism may slow the recolonization of areas formerly occupied by beluga whales, may increase species vulnerability to localized threats (e.g., decline of a preferred food source), and slow behavioural and ecological adaptation to ecosystem change.

This study provides new insights into the fundamental nature of beluga whale social structure and challenges prevailing hypotheses about social organization, kinship and the selective advantages of group living in this species. A more detailed exploration of many of the study’s findings can be found in Supplementary Information online. Future research should focus on competition, conflict and selfish vs cooperative behaviour in beluga whale societies, expand genetic studies to identify more distant (≥ 3rd-order) relatives, and investigate the mechanism and the significance of the spread of cultural innovations, especially as related to population responses to climate change.

## Methods

Data on the size and composition of beluga whale groups were collected at several locations across the species range (Fig. 1). Data on behaviour were collected at a number of the locations using focal-follow sampling or opportunistic observations of animals conducted from shore, small boats or an observation tower prior to tissue sampling. Observations typically lasted from several minutes to a few hours. Detailed information on whale behaviour was recorded and classified into 4 broad categories: Travel, Mill, Social, and Other based on O’Corry-Crowe et al.80 (see SI Appendix 1 for details), and an index, D, was developed to quantify the diversity of behaviours recorded within beluga whale groupings: $$D = 1 - \frac{n\sum_{i} d_i^{2}}{n - 1}$$, where $$d_i$$ is the proportion of the ith behavioural category observed and n is the number of times the behaviour was observed. At a number of locations, the data collected on group size, composition and behaviour were more limited. In some of these locations (Table 1) caution is required in interpreting field data on group characteristics because whale behaviour was likely influenced by human activities (i.e., hunting) at the time of observation.
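
As a rough illustration of the diversity index, the sketch below computes D from a list of behaviour records for one grouping. It reads the formula as D = 1 − n·Σd_i² / (n − 1) and takes n to be the total number of behaviour records; both readings are assumptions, as are the function name and the example records. Under this reading, a grouping recorded in only one category scores slightly below zero.

```python
from collections import Counter


def behaviour_diversity(observations):
    """Diversity index D = 1 - n * sum(d_i^2) / (n - 1).

    Assumption: n is the total number of behaviour records for the
    grouping and d_i is the proportion of records in category i.
    """
    n = len(observations)
    if n < 2:
        raise ValueError("need at least two observations")
    counts = Counter(observations)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return 1 - n * sum_sq / (n - 1)


# Hypothetical records for one grouping: proportions 0.5, 0.25, 0.25.
records = ["Travel", "Travel", "Social", "Mill"]
d_value = behaviour_diversity(records)
```

A grouping whose records are spread over more categories yields a higher D, consistent with herds and large aggregations scoring highest.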

Animals were considered to be in a group if they were aggregated in space (i.e., non-uniformly distributed) at the time of observation. A distinction was made between large, loose groupings of whales, termed ‘herds’, and smaller, more compact groupings of individuals termed ‘social groups’. The former comprised groups of over 50, and as many as fifteen hundred, loosely associated animals seen in bays, inlets or estuaries. Social groups comprised between 2 and 50 whales in close association (defined as within 12 m or up to 4 body lengths of other group members), in which physical contact between animals was common. Although the distinction between social group and herd is based on our field observations of whale behaviour and group size, the size cutoff is somewhat subjective. Others have reported group sizes similar to those in our definition of a social group16,19,21. The smaller social groups often occurred within the larger herds. In some instances, longer-term temporal patterns of grouping behaviour were also available from a series of satellite-linked telemetry studies of beluga whale movements and dive behaviour43,44,81,82 that spanned periods from a few days to several months.

Tissue samples were collected from: (a) free-swimming beluga whales via remote biopsy, (b) temporarily captured whales during tagging operations, or (c) harvested whales during biological sample collection between 1988 and 2008. Details on tissue collection and preservation methods can be found elsewhere35,83,84. Total DNA was extracted from each tissue sample using established protocols and screened for variation within 410 bp of the mtDNA control region and at eight independent microsatellite loci (see refs. 35,40,85 for methodological details). The sex of each sample was determined by PCR-based methods86, and replicate genotyping, sequencing and sex determination were conducted to estimate error rates.

Earlier studies revealed that the eight hypervariable microsatellite loci were highly informative in determining individual identity, assessing gene flow and population structure, and estimating first- and second-order relationships in beluga whales40,84,85,87. These earlier studies also determined minimum allowable thresholds for missing data and genotyping errors and found that individual identity required a minimum of four of these loci87 while accurate relatedness (r) estimation required a minimum of six loci genotyped per individual40. In the current study, we continued such tests by conducting analyses on datasets with individuals scored at ≥ 6 loci, ≥ 7 loci, and all 8 nuclear loci. With data from known cow–calf pairs, these analyses showed that individuals scored at a minimum of 6 loci provided reliable estimates of high relatedness and close genealogical relationship. In the network analysis (see below), we raised this threshold to ≥ 7 loci to avoid possible spurious network edges due to lower confidence in very low levels of estimated relatedness.
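The locus-count thresholds described above can be illustrated with a short Python sketch that builds the three nested datasets (≥ 6, ≥ 7 and all 8 loci). The data layout (a list of allele pairs per individual, with `None` marking a missing score), the function names and the example genotypes are assumptions made for illustration and do not reflect the study's actual data format.

```python
def n_scored_loci(genotype):
    """Count loci at which both alleles were scored (None marks missing data)."""
    return sum(1 for a, b in genotype if a is not None and b is not None)


def filter_by_loci(genotypes, min_loci):
    """Keep only individuals genotyped at >= min_loci loci."""
    return {ind: g for ind, g in genotypes.items()
            if n_scored_loci(g) >= min_loci}


# Hypothetical genotypes: 8 loci, each a pair of allele sizes.
missing = (None, None)
genotypes = {
    "whale_A": [(120, 124)] * 8,
    "whale_B": [(120, 122)] * 7 + [missing],
    "whale_C": [(118, 124)] * 6 + [missing] * 2,
    "whale_D": [(118, 120)] * 5 + [missing] * 3,
}

# Thresholds follow the text: >= 6, >= 7, and all 8 loci.
for threshold in (6, 7, 8):
    print(threshold, sorted(filter_by_loci(genotypes, threshold)))
```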

The programmes coancestry88 and ml-relate89 were used to estimate r and genealogical relationship among individuals based on the microsatellite data. coancestry implements seven estimators of r that use multilocus genotype data. This programme uses simulations of genotypic data of pairs of individuals with one of four predefined relationships: parent–offspring (PO), full-sib (FS), half-sib and grandchild–grandparent (HS), and unrelated (U), to determine which estimator performs best for a particular study. We used known population allele frequencies to simulate sets of paired genotypes that fit all four relationship categories (i.e., PO, FS, HS and U) and that were similar in sample size to our larger group sample sets (n = 20–40). Of the seven r indices compared, the dyadic likelihood estimator, rwang90, and the moment estimator, rQG91, performed best and thus these two estimators are presented here. ml-relate, which uses a maximum likelihood approach to estimate the likely relationship between pairs of individuals for the same four relationship categories (PO, FS, HS and U), facilitated comparisons of r and relationships for each pair of individuals.
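The simulation step can be sketched in Python as follows. This is not the coancestry implementation: it only generates pairs of multilocus genotypes under the four relationship categories from a set of allele frequencies, using simple Mendelian inheritance. The allele frequencies, the function names and the allele-sharing summary are hypothetical, and the HS category is simulated as half-sibs only.

```python
import random


def draw_allele(freqs, rng):
    """Draw one allele according to population allele frequencies."""
    alleles = list(freqs)
    return rng.choices(alleles, weights=[freqs[a] for a in alleles])[0]


def random_genotype(loci, rng):
    """Genotype of an unrelated individual drawn from the population."""
    return [(draw_allele(f, rng), draw_allele(f, rng)) for f in loci]


def offspring(mother, father, rng):
    """One allele from each parent at every locus."""
    return [(rng.choice(m), rng.choice(f)) for m, f in zip(mother, father)]


def simulate_pair(relationship, loci, rng):
    """Simulate a dyad with relationship PO, FS, HS (half-sibs) or U."""
    if relationship == "U":
        return random_genotype(loci, rng), random_genotype(loci, rng)
    mother = random_genotype(loci, rng)
    father = random_genotype(loci, rng)
    if relationship == "PO":
        return mother, offspring(mother, father, rng)
    if relationship == "FS":
        return offspring(mother, father, rng), offspring(mother, father, rng)
    if relationship == "HS":
        other_father = random_genotype(loci, rng)
        return (offspring(mother, father, rng),
                offspring(mother, other_father, rng))
    raise ValueError(f"unknown relationship: {relationship}")


def loci_sharing_allele(g1, g2):
    """Number of loci at which the two genotypes share at least one allele."""
    return sum(1 for a, b in zip(g1, g2) if set(a) & set(b))


# Hypothetical allele frequencies, reused for 8 loci for brevity.
LOCI = [{"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}] * 8

rng = random.Random(42)
for rel in ("PO", "FS", "HS", "U"):
    pairs = [simulate_pair(rel, LOCI, rng) for _ in range(30)]
    mean_shared = sum(loci_sharing_allele(x, y) for x, y in pairs) / len(pairs)
    print(rel, round(mean_shared, 2))
```

Simulated dyads of this kind provide known relationships against which candidate estimators of r can be compared. The estimators themselves are not reproduced here.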

coancestry was used to test for differences in average r among groupings. Specifically, differences in mean r among subgroups of individuals that had the same versus different maternal lineages (i.e., mtDNA haplotypes) were tested to determine whether larger aggregations of beluga whales were composed of matrifocal family units. The observed differences were compared to a distribution of differences based on 50,000 randomized bootstrap runs of the data. Statistical hypothesis tests were conducted, and descriptive statistics summarizing the patterns of r within different groupings were calculated, in Excel 2016. The combined mtDNA-microsatellite analyses also allowed inferences of paternal relatedness when high r and close genealogical relationships (PO, FS, and HS) were estimated among individuals with different mtDNA haplotypes.
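The logic of comparing an observed difference in mean r against a randomized distribution can be sketched as below. This is a generic label-permutation test written for illustration, and it should not be read as the resampling scheme that coancestry itself uses. The function name, the one-sided p-value and the example r values are assumptions.

```python
import random


def randomization_test(r_same, r_diff, n_runs=50_000, seed=1):
    """One-sided randomization test for mean(r_same) - mean(r_diff).

    r_same: pairwise r among individuals sharing an mtDNA haplotype
    r_diff: pairwise r among individuals with different haplotypes
    Returns the observed difference and the proportion of randomized
    differences at least as large as the observed one.
    """
    rng = random.Random(seed)
    k = len(r_same)
    observed = sum(r_same) / k - sum(r_diff) / len(r_diff)
    pooled = list(r_same) + list(r_diff)
    extreme = 0
    for _ in range(n_runs):
        rng.shuffle(pooled)  # reassign values to the two groups at random
        diff = sum(pooled[:k]) / k - sum(pooled[k:]) / (len(pooled) - k)
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (extreme + 1) / (n_runs + 1)


# Hypothetical pairwise relatedness values.
r_same = [0.45, 0.52, 0.25, 0.31, 0.48, 0.27]
r_diff = [0.02, -0.05, 0.10, 0.00, 0.07, -0.02, 0.04, 0.01]

observed, p_value = randomization_test(r_same, r_diff, n_runs=2_000)
print(round(observed, 4), p_value)
```

The default of 50,000 runs matches the number reported in the text. The example call uses fewer runs only to keep it quick.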

Because large groups of animals (e.g., herds, flocks) will likely contain a certain proportion of close relatives without necessarily indicating behavioural preferences for associations among close kin, the R package demerelate v. 0.9-3 (refs. 92,93) was used to investigate whether beluga whale herds and seasonal aggregations had more close relatives than would be expected by chance. The programme was designed to assess FS, HS and U frequencies, but not PO frequencies (Kraemer, personal communication). Therefore, in order to exclude PO pairs from the empirical datasets prior to running demerelate, ml-relate was used to independently estimate relationship categories (i.e., FS, HS, U and PO: see above), and then one individual from each PO pair was excluded. In the demerelate analysis, two estimators of pairwise relatedness were compared: the genotype-sharing Mxy94, as recommended by the programme’s authors, and the widely used rxy (91; note this is the same as rQG in coancestry). For each estimator, populations of randomized offspring and unrelated individuals are generated from a reference population in order to calculate threshold values for FS, HS and U relationships. The reference population in each case was the source population for the particular beluga group. χ² tests were then used to compare observed FS and HS frequencies to expected frequencies among a number of individuals (of the same sample size as the observed data) that were randomly generated from the allele frequencies of the reference population.
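Two steps of this procedure, dropping one member of each PO pair and comparing observed with expected relationship frequencies, can be sketched in plain Python. This is not the demerelate API. The function names, the rule for which member of a PO pair is dropped, the example counts and the choice of 2 degrees of freedom (three categories minus one) are all assumptions made for illustration.

```python
import math


def drop_one_per_po_pair(individuals, po_pairs):
    """Remove one member of each parent-offspring pair.

    The second member is dropped whenever both are still retained;
    which member to drop is an arbitrary choice in this sketch.
    """
    kept = set(individuals)
    for a, b in po_pairs:
        if a in kept and b in kept:
            kept.discard(b)
    return sorted(kept)


def chi_square(observed, expected):
    """Pearson chi-square statistic for matched lists of counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical sample and PO pairs identified beforehand.
sample = ["w1", "w2", "w3", "w4"]
print(drop_one_per_po_pair(sample, [("w1", "w2"), ("w2", "w3")]))

# Hypothetical dyad counts in the order FS, HS, U.
observed = [12, 30, 148]
expected = [5.0, 20.0, 165.0]

stat = chi_square(observed, expected)
# For 2 degrees of freedom the chi-square tail probability is exp(-x / 2).
p_value = math.exp(-stat / 2)
print(round(stat, 3), p_value)
```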

Network analysis was used to investigate the patterns of genetic relationships among all the individuals sampled within a social group or herd. Using the programme EDENetworks v. 2.18 (ref. 95), we built networks based on estimates of individual pairwise relatedness that were converted into genetic distance matrices. Networks were then compared based on the two estimators of r that performed best for our investigation (rwang and rQG, see above), and automatic thresholding was used to identify the point at which further removal of links (termed edges) fragmented the network into small components. This typically incorporated the majority of close relationships (PO, FS, and HS) within the social group/herd into the network. These thresholds were then manually adjusted to the point where links between unrelated (U) individuals were excluded (individuals are termed nodes). ml-relate analysis (see above) was used to identify the four relationship categories. The analysis provided descriptors of overall network topology and information on the network properties of individual whales, including: (1) degree (k): the number of edges connected to a node; (2) betweenness centrality (bc): the number of shortest paths running through that node; and (3) clustering coefficient (C): the ratio of the existing number of connections between a node’s neighbours to the maximum number possible. This programme also enabled us to investigate the genetic networks in terms of other properties of the individuals within those networks, including their age, sex and maternal lineage (i.e., mtDNA haplotype).
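The three node-level descriptors can be illustrated with a small, self-contained Python sketch. It does not reproduce EDENetworks: the conversion of relatedness to distance as 1 − r, the fixed threshold, the function names and the example r values are assumptions, and automatic thresholding is not implemented. Betweenness is computed with Brandes' algorithm for unweighted graphs, which weights each shortest path by its share when several shortest paths connect the same pair of nodes.

```python
from collections import deque
from itertools import combinations


def build_network(relatedness, nodes, max_distance):
    """Link two nodes when their distance (assumed 1 - r) is <= max_distance."""
    adj = {v: set() for v in nodes}
    for a, b in combinations(nodes, 2):
        r = relatedness.get((a, b), relatedness.get((b, a), 0.0))
        if 1.0 - r <= max_distance:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def clustering(adj, v):
    """Existing links among v's neighbours divided by the maximum possible."""
    k = len(adj[v])
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])
    return links / (k * (k - 1) / 2)


def betweenness(adj):
    """Brandes' betweenness centrality for an unweighted, undirected graph."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        dist = dict.fromkeys(adj, -1)
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:  # accumulate dependencies, farthest nodes first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected path is counted from both endpoints, so halve.
    return {v: b / 2 for v, b in bc.items()}


# Hypothetical pairwise r values; unlisted pairs are treated as r = 0.
nodes = ["A", "B", "C", "D", "E"]
relatedness = {("A", "B"): 0.5, ("A", "C"): 0.5,
               ("B", "C"): 0.25, ("C", "D"): 0.25}

adj = build_network(relatedness, nodes, max_distance=0.8)
bc = betweenness(adj)
for v in nodes:
    print(v, len(adj[v]), round(clustering(adj, v), 3), bc[v])
```

In this toy network, individual C connects D to the A–B–C cluster and individual E remains unconnected, which is the kind of pattern the descriptors are meant to expose.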

All activities involving live whales were permitted (USMMPA #782-1719-06, NARA #2013/36156-2, GOS #2013/00050-42 a.512, NOAA782-1438) and approved by the relevant authorities in each country: the US National Marine Fisheries Service Office of Protected Resources, the Russian Federation Marine Mammal Permits Office, the Department of Fisheries and Oceans, Canada (scientific licenses), and the Norwegian Animal Care Board. All activities were performed in accordance with the relevant guidelines and regulations.

### Statement on the study of live animals

All activities involving the sampling of live animals were carried out in accordance with relevant guidelines and regulations (see “Methods” for details).