Main

How can collective property rights evolve to manage otherwise open-access resources? Elinor Ostrom provided a fundamental insight when she identified eight design principles typically associated with groups that successfully collectively govern natural resources without reliance on individual private property or top-down government regulation1. Central among these design principles are (1) clearly defined access rights that regulate who can access the resource and (2) enforced use rights, specific to the resource’s ecological and socio-economic context, that regulate use patterns among sanctioned users1,2. Together, these two principles delineate how a resource can be used and by whom, thereby forming the conceptual basis for collective property rights3,4,5.

Nevertheless, over the past three decades, the design principles have been subject to extensive debate and elaboration, particularly regarding the extent to which the different design principles causally drive sustainable outcomes, the relative importance of different principles in different resource systems, the interactions between different principles and the neglected domains of analysis such as power imbalances6,7,8,9,10,11. These concerns have highlighted the need for scholars and practitioners alike to develop a more dynamic understanding of the design principles and collective property rights9,12,13 that examines both their cultural evolutionary origins14 and the coevolutionary interactions that contribute to their emergence and stabilization10,15.

A focus on dynamics, however, has not been entirely absent from previous research; for example, economists and historians have long argued that contests over resources have been a fundamental driver in the evolution of property rights16,17,18,19. Yet, this work has primarily focused on the collective-to-private property rights transition rather than the emergence of collective property rights in the first place20,21,22,23,24 (but see ref. 25) and on how groups can stabilize cooperation in managing a resource over which they already have exclusive access2,26,27,28,29. In contrast, evolutionary researchers have focused on a different side of this puzzle by identifying the environmental conditions that favour group territoriality30, investigating how competition between groups in structured populations can lead to novel evolutionary outcomes via multilevel selection31,32,33,34, developing models on how humans transmit culture35 and understanding the evolutionary consequences of our ability to modify our natural environment36. Complementing both of these streams of research, empiricists have focused on how institutions can contribute to intergroup natural resource-based conflicts37 and how such conflicts can affect institutional performance38.

However, so far, little direct attention has been paid to the processes whereby human groups establish and stabilize exclusive access over previously open-access resources and how this group ownership affects the evolution of sustainable use rights. This represents a substantial gap in our understanding of collective property rights for at least three reasons. First, collective property rights emerging from open access are probably millennia old, as hunter-gatherer societies often have some form of collective property rights over natural resources39,40, which was presumably preceded by relative open access. Thus, the development of institutions regulating collective ownership of natural resources represents a significant feature of our evolutionary history. Second, understanding collective property rights and design principles as dynamic evolving systems rather than static entities is a central challenge within sustainability science9,41,42. Finally, knowledge of how collective property rights emerge and are maintained has fundamental real-world applications for natural resource governance. Indeed, many groups are still struggling today to impose collective property rights over de facto open-access resources43.

Building on recent work in cultural evolution32,33,34,44,45,46, this paper presents a set of models (agent-based and difference equation models) of how collective property rights over natural resources can emerge, the challenges groups face in establishing them and the precise role these rights play in sustainable governance. Our approach contributes to existing work in sustainability science in several ways. First, it develops a formal theoretical model that can easily be adapted to other resources and social contexts9. Second, our dynamic modelling approach allows us to examine coevolutionary interactions between design principles10. Finally, such general models help empirical researchers develop data collection protocols that target specific variables and causal processes, thereby reducing research costs and spurious positive empirical findings47.

Brief model description

Background

Before presenting the theoretical model, we outline the empirical and ethnographic observations from our long-term field site on the island of Pemba, Tanzania, that motivated the specific design of our model.

In 2010, a REDD+ (reducing emissions from deforestation and forest degradation) readiness project was introduced in 18 wards on the island, granting communities communal property rights over their forests through Community Forest Management Agreements (CoFMAs) that legally demarcated user groups (access rights) and allowed users to specify harvest regulations (use rights)48.

By the end of the REDD+ readiness project (2015/2016), communities ‘without’ CoFMAs voiced protestations to the government, claiming that individuals from communities ‘with’ CoFMAs were poaching trees from their forests. This conflict led the groups ‘without’ CoFMAs to petition the government for similar legal status to protect their forests from these predations49.

Somewhat consistent with this chain of events, our empirical research revealed that communities experiencing high levels of theft invested more in enforcing access rights. However, they were more lax in enforcing use rights and promoting resource regrowth. Specifically, they held fewer planting events (see Supplementary Fig. 5), were less likely to comply with harvesting limits in experiments50, were less responsive to punishment for overharvesting50 and were more likely to favour relaxed harvesting policies51. These findings suggest that intergroup conflict may affect the demand for access rights and the demand for use rights differently. This motivated us to investigate the emergence of collective property rights in a multigroup framework, in a model that explicitly distinguished between access and use rights.

Model description

The modelling framework takes the form of agents nested within groups of a fixed size, each with an associated stationary resource. Agents select the resource patch they will harvest from and choose an amount of time spent harvesting the resource or working in ‘wage labour’, via a standard Gordon–Schaefer bioeconomic time allocation model52,53,54.

Collective property rights can evolve via agents allocating resources to support two different costly local institutions. The first institution is directed solely at establishing ‘enforceable access rights via boundary patrols that exclude outsiders’ (hereafter, we call these outsiders ‘roving bandits’55). The second institution defines and enforces citizen obligations through harvesting policies, jointly called ‘use rights’3. Use rights involve specifying and enforcing a maximum allowable harvest (MAH), which is ‘sustainable’ when the MAH is below the maximum sustainable yield (MSY) (MAH ≤ MSY). Designing a sustainable MAH corresponds with Ostrom’s1 second and fourth design principles stipulating that policies must be adapted to local conditions and constructed by local users.

We specifically model the process by which groups construct MAH policies. To do so, we ‘do not’ assume an enlightened social planner as is common in natural resource management studies (see refs. 2,28,29,46,52). Thus, evolving a sustainable MAH is a group-level search process with a hidden target embedded within a set of nested social dilemmas.

To discover the MAH, all agents have evolving private non-enforceable normative beliefs that stipulate their preference for the group’s MAH policy, but these do not affect the individual’s harvest effort directly. Instead, groups continually aggregate these preferences56 to form a group’s MAH policy via the median voter theorem (taking the group-level median)57. In contrast to individual private normative beliefs, this group-MAH policy is enforceable, as it bestows a ‘social license’ to punish those who violate it. Given this policy, agents can invest in monitoring use rights that, when successful, confiscate the goods of any local individual caught harvesting above the MAH.

Over time, as individuals interact and socially learn via payoff-biased imitation from in/out-group members, resource-conserving institutions may evolve that differ in their underlying policy rules (use rights) and investment in access rights (enforceable boundaries). For an overview of the model processes, see Fig. 1; for the full model, see Methods; for full parameter sweeps, see Supplementary Information.

Fig. 1: Model process overview.

This graphic shows the sequential process of the model, moving through each major submodel and its subcomponents.

Results

The causes of intergroup conflict

Individuals are incentivized to engage in roving banditry whenever the net benefits of doing so are greater than that of harvesting in their own territory. In the absence of enforced access rights, the relative costs and benefits of roving banditry are determined by (1) spatial and temporal variation in resource stock30, which promotes banditry by agents from territories with few resources (Fig. 2a); (2) strict policies that limit the MAH and incentivize roving banditry by raising the opportunity costs of harvesting within the home territory (Fig. 2b); and (3) low travel costs (Supplementary Information). Without enforced access rights, the landscape approximates an ideal free distribution under open access.

Fig. 2: The evolution of collective property rights.

a, The effect of heterogeneity in patch size (exogenously controlled) on roving banditry (boundaries disabled). b, The effect of MAH policies (exogenously controlled) on roving banditry (access rights disabled). c, Investments in access rights (seizures disabled). Lines show the average investment in access rights for a single group in a single simulation. d, Investment in access rights (seizures enabled). e, Results from the difference equations with parameters set to sustain the resource above a threshold: enforcement of access rights in blue, resource stock in green and bandits in red. f, Replicates e while improving harvesting technology, causing the resource to fall below a critical threshold, impairing the stability of access rights. g, The relationship between roving banditry (exogenously controlled) and investment in monitoring use rights (access rights disabled). h, The relationship between access rights (exogenously controlled) and investment in monitoring use rights. i–l, The effect of out-group learning (exogenously controlled) on investment in monitoring use rights (i), MAH policies (j), resource stock levels (k) and payoffs (l). m, The covariance between MAH policy and payoffs. Black points marked as ‘not enforced’ have a low investment in self-regulation (R ≤ 0.5) and orange points marked as ‘enforced’ are those that have stabilized support (R > 0.5). n, Tipping point in payoffs as a function of stock level (MSY). o, Groups’ search processes for a sustainable MAH (rate of roving banditry is low (exogenously controlled)). Green lines show intact resources and turn black once the resource has collapsed (‘stock’ ≤ 0.05 maximum capacity). p, Replicates o but with roving bandits allowed (exogenously controlled). We use 100 groups on a 10 × 10 lattice for all simulations, with 30 individuals per group. In g–l, each point is the average value in the last 1,000 time steps of the y axis variable from a single simulation after a 4,000-round burn-in. Lines show the average across all simulations. Enf., enforced. For the parameter configuration used, see Methods and Supplementary Information; for full parameter sweep, see Supplementary Information.

Evolving access rights

The establishment and maintenance of access rights is a public good. While enforced access rights may increase group-level payoffs by reducing the number of competitors accessing the patch, the costs associated with creating and enforcing them are borne by individuals. The payoff to free riding means that the invasion of investment in enforcing access rights is highly unlikely (Fig. 2c), except in cases where group selection pressure is very strong or additional mechanisms are in place to encourage investment in access rights (for example, ref. 58). One such mechanism is if there is a partial alignment of individual and group-level interests, such as when agents can keep seized property from bandits or issue trespassing fines. In effect, this ‘wage’ from seizures allows for the invasion and stabilization of enforced access rights (Fig. 2d). However, the individual-level benefits of investment in access rights (wages from seizures) depend on the harvests of bandits (and thus the resource stock indirectly). An outcome of this is that the benefits from seizures are the highest at the start of any simulation, as stocks are high and the system is near open access. The high initial payoff offered by seizures allows monitoring to invade and become common in groups. However, the potential profits from seizures fall as the resource stock declines and enforced access rights deter bandits (see below and Fig. 2d). Yet, if access rights already exist, then group selection (if strong enough) can help maintain enforcement even after the individual-level benefits from seizures are no longer sufficient to create a direct individual-level benefit (see below).

One emergent property of access rights (and use rights) is that investment in them can oscillate over time (Fig. 2d,e). To examine more closely why these oscillations occur, we composed the following set of difference equations focused solely on investment in access rights:

$$\Delta S=\alpha SB-\beta XS$$
(1)

and

$$\Delta X=IXSB-\zeta X$$
(2)

and

$$\Delta B=rB(1-B)-\omega BS$$
(3)

where α, β, I, ζ, r and ω are all parameters, X and S are the frequencies of agents patrolling the boundary (enforcing access rights) and of roving bandits, respectively, and B is the current resource stock measured as a proportion of its maximum (that is, the carrying capacity).
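Equations (1)–(3) can be iterated directly. The following is a minimal sketch rather than the simulation code used for Fig. 2e,f: the function names, the initial conditions and the clipping of each state variable to [0, 1] are assumptions made for illustration, and whether a given trajectory cycles or collapses depends on the parameter values supplied.

```python
def _clip(value):
    """Keep a frequency or proportion inside [0, 1] (an assumption of this sketch)."""
    return min(1.0, max(0.0, value))


def step(S, X, B, alpha, beta, I, zeta, r, omega):
    """Apply equations (1)-(3) once and return the updated (S, X, B).

    S: frequency of roving bandits
    X: frequency of agents patrolling the boundary
    B: resource stock as a proportion of carrying capacity
    """
    dS = alpha * S * B - beta * X * S          # equation (1)
    dX = I * X * S * B - zeta * X              # equation (2)
    dB = r * B * (1.0 - B) - omega * B * S     # equation (3)
    return _clip(S + dS), _clip(X + dX), _clip(B + dB)


def simulate(steps, S0=0.1, X0=0.1, B0=1.0, **params):
    """Iterate the system; S0, X0 and B0 are hypothetical starting values."""
    trajectory = [(S0, X0, B0)]
    for _ in range(steps):
        trajectory.append(step(*trajectory[-1], **params))
    return trajectory
```

For example, `simulate(500, alpha=0.1, beta=0.5, I=0.5, zeta=0.05, r=0.2, omega=0.1)` returns a list of 501 `(S, X, B)` tuples that can be plotted against time; these parameter values are placeholders, not those used in the paper.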

The relationship between access rights and roving banditry is frequency dependent and can be cyclical (Fig. 2e). The level of investment in access rights (X) increases as the number of roving bandits (S) increases, resulting in more seizures and increased investment in access rights. However, as boundary patrols increase, the frequency of bandits decreases, leading to a decline in payoffs from seizures and a subsequent decrease in boundary patrols. This decrease can eventually lead to an increase in the frequency of bandits, resulting in another cycle. If the resource stock declines below a threshold, investment in boundary patrols can fall to zero, as the revenue from seizures no longer covers the institutional costs of enforcing access rights (Fig. 2f). This oscillatory pattern is typical in predator–prey models31,59 and is a function of parameter combinations and model specifications, specifically linear versus nonlinear cost functions.

Access rights allow for multilevel selection

Achieving resource sustainability requires collective property rights regimes to enforce sustainable use rights. However, these use rights can be vulnerable to exploitation by free-riding bandits if access rights are not properly enforced (Fig. 2g,h). Thus, when enforced access rights are absent and bandits are present, there will be little to no individual or collective benefits from enforcing a sustainable MAH. A further consequence of unenforced use rights is that the MAH policy is not under selection and is solely determined by drift (Fig. 2o,p); thus, evolutionary optimization processes cannot find optimal policy levels. In addition, if agents can bypass their group’s policy by harvesting from another group’s territory, local use rights become irrelevant. A global emergence of enforced access rights is necessary to eliminate any ‘safe havens’ where agents can avoid their group’s local policies (Fig. 2h).

Consequently, enforcing access rights creates a positive relationship between group-level use rights and payoffs (Fig. 2k). This is a crucial first step in the evolution of sustainable resource use because groups with sustainable use rights can achieve higher long-run payoffs than those without. However, there remains a conflict between the levels of selection. As such, the long-run equilibria of these systems are determined by the relative strength of selection on different levels of organization (‘groups’ or ‘individuals’), which in our models is a function of the rate of out-group learning (Fig. 2i–l).

Specifically, traits related to sustainable use rights (enforcing use rights and a MAH ≤ MSY) may spread across groups via payoff-biased imitation if agents can learn from individuals outside their group. This is because in-group agents who free-ride (overharvesting, having private MAH beliefs larger than the MSY, and not enforcing use rights (when seizures are inadequate to cover costs)) will have higher relative payoffs within-group, but at the cost of their group having lower absolute average group-level payoffs. When selection at the individual level dominates (out-group learning is low), agents only see the relative payoffs of their own group members. Thus, unsustainable traits spread as defectors are preferentially copied. However, when agents can learn from out-group members, they are more likely to adopt the traits of out-group members from groups with more sustainable use rights because the social learning mechanism can ‘see’ the higher absolute payoffs (see Fig. 2l)32, and this makes them more likely to be copied even though those agents will have lower payoffs when compared with other members of their own in-group. Therefore, as the probability that agents learn from out-group members increases, so does the likelihood that sustainable use rights fixate in the population (Fig. 2i,j).
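The role of out-group learning can be illustrated with a small sketch of payoff-biased model choice. This is not the paper's learning rule, which is specified in the Methods and Supplementary Information; the function name `choose_model`, the parameter `p_out` and the use of payoff-proportional sampling are assumptions made here for illustration, and payoffs are assumed to be non-negative.

```python
import random


def choose_model(learner_group, payoffs, p_out, rng):
    """Pick the agent a learner copies, returned as (group_id, index).

    payoffs: dict mapping group_id -> list of member payoffs (non-negative)
    p_out:   probability that the learner samples from other groups
    rng:     a random.Random instance, passed in for reproducibility
    """
    groups = list(payoffs)
    if len(groups) > 1 and rng.random() < p_out:
        # Out-group learning: candidates come from every other group,
        # so absolute payoff differences between groups become visible.
        pool = [g for g in groups if g != learner_group]
    else:
        # In-group learning: only relative payoffs within the group matter.
        pool = [learner_group]
    candidates = [(g, i) for g in pool for i in range(len(payoffs[g]))]
    weights = [payoffs[g][i] for g, i in candidates]
    if sum(weights) <= 0:
        weights = None  # fall back to uniform sampling
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With `p_out = 0`, a free rider who out-earns fellow group members is the most likely model; as `p_out` rises, members of groups with higher absolute payoffs are sampled more often, which is the mechanism the paragraph above describes.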

Access rights help the search for sustainable policies

Finding and stabilizing a sustainable MAH is a difficult evolutionary search process. Access rights help in two critical ways. First, the boundaries constructed via access rights create a foundational covariance between group-level traits and group payoffs (see above), thus allowing selection to operate on the MAH (see Fig. 2n). Second, these boundaries can reduce the stochasticity in socio-ecological systems, creating a more precise target for the MAH search process (Fig. 2o,p).

A central feature of such systems is that group profits are often maximized when total harvests equal the MSY52. Consequently, individual and group selection drive harvests and the MAH towards the MSY (Fig. 2m,n). This is a dangerous bifurcation point because any excess harvest pushes the system towards collapse60. Roving banditry introduces significant stochasticity to the total harvest from a resource and can thus tip the system towards ruin. Enforced access rights reduce this stochasticity by fixing the number of users, increasing the probability of evolving and stabilizing sustainable use rights (Fig. 2o)61.

Discussion

The evolution of sustainable institutions is critically dependent on clearly defined and enforced access rights. Access rights serve a mechanistic role in the evolution of sustainable institutions by (1) enabling users to derive benefits from investments and institutions that improve resource quality, (2) creating groups that can be subject to multilevel selection via out-group learning and (3) reducing stochasticity in systems by limiting unpredictable harvests from outsiders, thereby facilitating the identification and stabilization of sustainable MAH policies.

Seizures help align individual and collective goals

Traditionally, common-pool resource (CPR) governance has been modelled as a public good, yet CPRs often contain tangible physical resources that can be seized and provide individual-level incentives for investing time and resources in governance62. How these seizures are distributed matters: are they a wage, burned in a potlatch or shared evenly among community members? When seized goods go to pay individuals for their monitoring and patrolling, this can provide enough incentive for the initial evolutionary invasion of the exclusionary enforcement necessary for collective property rights. However, changing this assumption changes the stability of collective property rights.

For example, the Kenyan government famously burned a large stockpile of seized ivory on multiple occasions63. If monitors destroy seizures in this way, boundary maintenance often reverts to a classic public goods problem and generally needs alternative mechanisms to evolve and be maintained, such as immediate collective benefits from well-maintained resources (for example, reputational gains, highly noticeable ecosystem services or eco-tourism) or costs from poorly maintained resources (for example, pollution); direct cash payments from businesses and governments for the provision of ecosystem services may also suffice. Without such mechanisms, groups will still struggle to stabilize investment in monitoring and patrols due to the lack of individual-level incentives (see Fig. 2c,d).

Alternatively, recent discussion regarding benefit sharing mechanisms64,65 suggests that seizures could also be distributed to groups as a collective benefit, such as a feast for village elders responsible for extracting the fine66. When group members receive an equal share of the bounty, regardless of their contribution to monitoring and patrols, the incentive problem is partially abated because monitoring can result in higher payoffs. Nevertheless, it does not solve the collective action problem, as agents still benefit from shirking on monitoring while reaping the benefits from seizures (see Supplementary Information for results).

While seizures can provide an individual incentive for the evolution of collective property rights, our model does not account for the interpersonal conflicts involved in seizing goods or issuing fines. Dominance hierarchies, exclusive control of violence among monitors and formidability (physical and social) probably mediate these costs67. As such, we would predict seizures operating as a wage to be more common when a strong hierarchical or institutionalized power imbalance favours and legitimizes the monitors.

The evolution of collective property rights is sequential

Our results highlight a set of necessary preconditions for the evolution of sustainable collective property rights. Specifically, with intergroup competition, only once access rights are established and roving bandits excluded can groups begin solving the multidimensional challenge of finding and enforcing sustainable MAH policies. Establishing access rights is a crucial step because it fulfills one of the primary requirements of multilevel selection33: a covariance between group-level traits and payoffs. Once this covariance is established, the system requires multiple groups with a sustained difference in their enforced policy levels to create the necessary group-based variability for cultural multilevel selection to operate and allow for sustainable MAHs to evolve via payoff-biased imitation supported by group selection.

Out-group learning promotes sustainable use rights

The strength of selection on groups versus individuals limits the degree to which sustainable use rights can spread through the population. The strength of the selection on groups is a direct function of the amount of out-group learning. This single parameter is fundamental for the spread of sustainable use rights because it determines the degree to which agents update their traits on the basis of relative within-group comparisons or absolute payoffs via between-group comparisons. Comparing relative payoffs within an in-group reproduces a prisoner’s dilemma favouring overharvests and free riding, leading to the tragedy of the commons. However, learning from out-groups allows the covariance between use rights and payoffs to be ‘seen’ by agents, allowing for selection to operate on policies via social learning. Out-group comparison not only helps groups find sustainable policies but can also stop groups from becoming trapped at suboptimal equilibria that are maintained by excessive enforcement of overly restrictive MAH policies68.

There are two important qualifications regarding out-group learning. First, out-group learning is the sole mechanism in our model that allows group selection to operate. Yet it has functional similarities to migration and colonization, and should thus not be considered the only mechanism by which sustainable institutions can be transmitted across groups45. Second, there remains a question as to whether such a high dependence on out-group learning is viable in the first place. Indeed, as most social interactions occur within a group, a high reliance on learning from out-group members might seem unrealistic, or raise the question of under what conditions (if any) we would expect agents to adopt such high rates of out-group learning. One such case may be when there is high cultural and ecological similarity between groups and low geographic distance, such as in small islands like Pemba.

CPR systems have tipping points that create social dilemmas

There is a crucial caveat when considering the role of group selection in spreading sustainable use rights: ‘the social dilemma in CPR systems only emerges once the resource stock declines past the MSY threshold’. When the stock is greater than the MSY threshold, higher harvests and relaxed policies benefit individuals and groups with no conflict of interest. However, as depicted in Fig. 2m,n (just beyond the red MSY line), there exists a critical tipping point; when the stock level declines to or below the MSY threshold, the social dilemma unfolds: groups achieve higher payoffs by reducing total harvests, while individuals are still faced with the temptation to overharvest.
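As a worked illustration of where such a threshold sits, assume regrowth follows the logistic term of equation (3), with B measured as a proportion of carrying capacity; the regrowth function of the agent-based model may differ, and G and B* are notation introduced here. Regrowth is then maximized at half of carrying capacity:

$$G(B)=rB(1-B),\qquad \frac{dG}{dB}=r(1-2B)=0\ \Rightarrow\ {B}^{* }=\frac{1}{2},\qquad {\rm{MSY}}=G({B}^{* })=\frac{r}{4}$$

Under this assumption, any total harvest above r/4 per time step exceeds regrowth at every stock level, so the stock must decline; and once B falls below 1/2, regrowth itself shrinks as the stock shrinks, which is why excess harvest near this point pushes the system towards collapse.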

Once the stock falls below the MSY, group selection becomes the sole mechanism within the model to prevent groups from succumbing to the tragedy of the commons. Consequently, the true significance of group-level selection in fostering sustainability is only apparent when a group’s stock tips towards collapse. Before this tipping point, group selection is no different from individual-level selection and contributes to unsustainable practices by rewarding groups that drive their policies and harvests towards the MSY tipping point. Note that this tipping point dynamic is probably a very general feature of tightly coupled harvesting models, as it was first discovered in the context of group selection in predator–prey dynamics in ref. 31.

Finally, the capacity of groups to alter course and rescue a collapsing resource hinges on standing variation in policies across groups and the strength of selection acting upon them. Consequently, too much out-group learning can erode the group-level variation necessary for group selection and result in lower payoffs (as seen in the far right of Fig. 2l). However, if there is sufficient standing variation, group selection can swiftly drive groups towards adopting more conservative policies as they begin to fall into the overharvesting trap.

Causal inference in dynamic systems needs time-series data

The evolution of collective property rights is dynamic and frequency dependent. Thus, the sign and magnitude of the relationship between any of the system’s features is not a stable feature of the system itself but is rather a function of the relative frequency of the variables in question at any given time. For example, in Fig. 2e,f, we see that the dynamic relationship between investment in access rights and banditry exhibits two distinct stages: banditry drives the need for increased investment in access rights, and the collapse of access rights leads to further banditry. Depending on the phase of this cycle, a cross-sectional analysis may yield a positive, negative, or inconclusive association between banditry levels and investment in access rights69.

Therefore, to adequately study the dynamic evolution of CPR governance, it is essential to gather high-resolution time-series data from a large number of groups, encompassing information on monitoring and patrols, harvest policies, intergroup dynamics and resource stocks. Cross-sectional data can be useful in model calibration (using approximate Bayesian computation70); however, because these systems are rarely stable, they require time-series data and statistical modelling techniques such as those developed in ref. 71 to adequately link models such as ours to empirical data.

Methods

Model overview

The model itself is built upon four submodels: a political submodel based on the median voter theorem57 that determines each group’s MAH policy, a cooperation submodel for provisioning the public good of institutional enforcement (see ref. 28), a traditional Gordon–Schaefer bioeconomic time allocation harvesting submodel52,53,54 and a payoff-biased social learning model (see ref. 32; Fig. 2i–l) that allows for cultural evolution. The one-group version of the model is explored analytically in ref. 50.

Collective property rights comprise a bundle of two subrights: access rights and use rights. Functionally, agents can pay costs to support the two different local institutions to varying degrees. Access rights are directed solely at monitoring group-level resource boundaries and excluding outsiders who try to harvest the resources. In contrast, use rights have two subcomponents: one aimed at determining harvesting MAH policies and the other at enforcing them.

Topologically, agents are nested within N groups on a grid with one group per cell. All groups have a stable population of n infinitely long-living agents. Each group, and thus each grid cell, has an associated resource pool B from which, at baseline, any individual can harvest regardless of group membership. Therefore, a group can be considered a village with an associated forest/fishing lake within its boundaries and which it could potentially defend. At the start of all simulations, however, the resource is essentially open access as there is no previous enforcement of access rights. The resource in each grid cell has two primary state variables that define it: its maximum size and its regrowth rate.

Five essential traits define agents: (1) Effort, e, where \(e \in \Re, \ 0 \leq e \leq 1\), is the proportion of time they spend harvesting the resource. (2) Roving banditry, s, where \(s \in \{0, 1\}\), determines whether or not they harvest from their own patch. (3) Investment in patrolling resource boundaries, x, where \(x \in \Re, \ 0 \leq x \leq 1\), is the amount of resources invested in monitoring access rights. (4) Investment in monitoring in-group members, r, where \(r \in \Re, \ 0 \leq r \leq 1\), is the amount of resources invested in enforcing use rights. (5) Agents’ policy preference, m, where \(m \in \Re, \ 0 \leq m\), is their private belief about what the group’s MAH policy should be. The frequencies of these traits in the population are our primary outcome variables and dynamically evolve throughout the simulation via social learning.
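The five traits map naturally onto a small record type. The sketch below is illustrative only: the class name `Agent` and the validation logic are assumptions, the bounds on e, x and r are read as the closed interval [0, 1] (consistent with the Institutions section), and the encoding of s (1 meaning the agent roves) is an assumption because the text does not state which value denotes banditry.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    e: float  # effort: proportion of time spent harvesting, in [0, 1]
    s: int    # roving banditry flag; 1 is assumed here to mean "roves"
    x: float  # investment in patrolling boundaries (access rights), in [0, 1]
    r: float  # investment in monitoring in-group members (use rights), in [0, 1]
    m: float  # private preference for the group's MAH policy, m >= 0

    def __post_init__(self):
        # Reject trait values outside the ranges given in the text.
        for name in ("e", "x", "r"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")
        if self.s not in (0, 1):
            raise ValueError(f"s must be 0 or 1, got {self.s}")
        if self.m < 0:
            raise ValueError(f"m must be non-negative, got {self.m}")
```

Validating at construction keeps an out-of-range trait from silently propagating through social learning in a longer simulation.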

The i and g subscripts index individuals and groups such that individuals are numbered 1…n in each group and group IDs range from 1…N. The model advances in discrete time steps52, with each time step progressing through five stages.

Politics

First, for each group, the political submodel calculates the ‘median’ value of the group’s policy preference mig, thereby constructing the group’s policy (MAHg). MAHg sets the upper bound of ‘socially approved’ individual harvests, hig, that an in-group member can extract from their local resource without fear of having their harvests confiscated if inspected by agents enforcing use rights.
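The political step can be illustrated with a minimal Python sketch. It assumes a plain statistical median (the text places ‘median’ in quotation marks, so the exact rule used for even-sized groups may differ), and the function names `group_policy` and `is_socially_approved` are hypothetical.

```python
from statistics import median


def group_policy(preferences):
    """Return MAH_g as the median of the members' policy preferences m_ig.

    Assumption: an ordinary statistical median; the model's exact rule for
    even-sized groups is not stated in the text.
    """
    if not preferences:
        raise ValueError("a group needs at least one member")
    return median(preferences)


def is_socially_approved(harvest, mah):
    """A local harvest is safe from use-rights seizure when h_ig <= MAH_g."""
    return harvest <= mah


# Illustrative preferences for one five-member group (not from the paper).
prefs = [0.5, 3.0, 1.2, 0.9, 2.0]
mah = group_policy(prefs)
```
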

Institutions

Next, the institutions submodel has agents allocating resources (xig and rig) to increase the efficacy of patrols that enforce access or use rights. As 0 ≤ x ≤ 1 and 0 ≤ r ≤ 1, to determine the net costs borne by individuals from their investments, x and r are scaled by the institutional enforcement cost parameters cx and cr, respectively. Investments in either institution are made separately but from the same resource pool (the agent’s payoffs).

The probability that an institution inspects an agent’s harvest, I, is determined by the group’s total contribution to that institution (Xg or Rg). Ixg is the inspection probability for access rights in a particular group:

$${I}_{xg}={\left(\frac{{X}_{g}}{\delta }\right)}^{\phi }$$
(4)

and Irg is the inspection probability for use rights in a particular group:

$${I}_{rg}={\left(\frac{{R}_{g}}{\delta }\right)}^{\phi }.$$
(5)

Xg and Rg are the group’s total contributions to monitoring access rights and use rights, respectively. Defensibility, δ, is the total contribution needed to completely defend the resource, such that the institution would inspect all agents with a probability equal to one. ϕ is the elasticity of inspection efficacy with respect to invested resources, otherwise known as the monitoring technology. In all simulations, this elasticity is set to 1. Thus, investments in each institution translate linearly into inspection probabilities: if a group’s total contribution to an institution is half of δ (for example, with δ = n, group members contribute on average 50% of the maximum), then I = 0.5 for that institution, and there is a 50% chance that an out-group member harvesting from that patch will have their harvest inspected (Ix), or that an in-group member harvesting locally will have their harvest inspected (Ir). More generally, I can be considered the outcome of a public goods production function.
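Equations (4) and (5) share one functional form, which the following Python sketch evaluates. The function name `inspection_probability` is hypothetical, and clamping the result to the interval [0, 1] is an assumption added here so that contributions above δ cannot yield a probability greater than one.

```python
def inspection_probability(total_contribution, delta, phi=1.0):
    """Evaluate I = (X_g / delta) ** phi for either institution.

    total_contribution: X_g (access rights) or R_g (use rights), >= 0.
    delta: defensibility, the contribution needed for certain inspection.
    phi: monitoring technology; set to 1 in all reported simulations.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    if total_contribution < 0:
        raise ValueError("contributions cannot be negative")
    raw = (total_contribution / delta) ** phi
    # Assumption: cap at 1 so the value stays a valid probability.
    return min(1.0, raw)


# Ten agents each contributing 0.5, with delta equal to group size.
contributions = [0.5] * 10
i_x = inspection_probability(sum(contributions), delta=10.0)
```

With ϕ = 1 the mapping is linear, so halving the group’s contribution halves the inspection probability; other values of ϕ would make monitoring returns increasing or diminishing.
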

Harvesting

After contributions to the institutions are determined, agents ‘go to work’ in the harvest submodel. Agents dedicate a portion of their total effort, eig, to harvesting from the natural resource and the remainder, 1 − eig, to any other kind of work. Thus, the base payoff from the harvest submodel is:

$${\pi }_{ig}=p{h}_{ig}+w(1-{e}_{ig})$$
(6)

where

$${h}_{ig}=q{e}_{ig}^{\alpha }{B}_{l}^{\beta }$$
(7)

where p is the price of the goods harvested from the resource, w is the opportunity cost of harvesting the good (the value of all non-harvesting activities an individual could partake in), hig is the individual-specific harvest from the resource, q is the catchability parameter (or total factor productivity), Bl is the current resource stock at the location from which the agent is harvesting, and α and β are the elasticities of labour and resource stock on production, respectively. Note that the functional form of the production model is a Cobb–Douglas production function.
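A minimal Python sketch of equations (6) and (7) follows. The default parameter values are placeholders chosen for illustration only and are not the values in Supplementary Table 1; the function names are hypothetical.

```python
def harvest(effort, stock, q=1.0, alpha=0.5, beta=0.5):
    """Cobb-Douglas harvest h = q * e**alpha * B**beta (equation 7).

    Defaults for q, alpha and beta are illustrative placeholders.
    """
    if not 0.0 <= effort <= 1.0:
        raise ValueError("effort must lie in [0, 1]")
    if stock < 0:
        raise ValueError("stock cannot be negative")
    return q * effort**alpha * stock**beta


def base_payoff(effort, stock, price=1.0, wage=1.0, **production):
    """Base payoff pi = p * h + w * (1 - e) (equation 6)."""
    h = harvest(effort, stock, **production)
    return price * h + wage * (1.0 - effort)


# An agent spending a quarter of their time harvesting a stock of 4 units.
pi = base_payoff(effort=0.25, stock=4.0, price=2.0, wage=1.0)
```
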

The roving banditry trait, sig, determines whether the agents harvest locally or travel to other communities. If sig = 1, the agent senses a subset of non-local patches with a probability determined by their Euclidean distance and selects the patch with the highest stock. They then travel there temporarily and harvest for that time step before returning home. The patch from which a travelling agent harvests is denoted by lig. The agent harvests from their home patch if sig = 0.

The total harvest from any particular patch, Hg, is defined as follows:

$${H}_{\it{g}}=\mathop{\sum }\limits_{i=1}^{n}\left({h}_{\it{ig}}| {s}_{\it{ig}}=0\right)+\mathop{\sum }\limits_{j=1}^{N-1}\mathop{\sum }\limits_{i=1}^{{n}_{j}}\left({h}_{ij}| {s}_{ij}=1,{l}_{ij}=\it{g}\right)$$
(8)

where j ranges over the groups 1…N excluding the focal group g. Once agents finish harvesting, the effects of the institution submodel are realized as each ‘institution’ separately attempts to monitor, inspect and seize goods.
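The bookkeeping behind equation (8) can be sketched in Python as a single pass over all agents, crediting each harvest to the patch where it actually took place. The tuple layout `(home_group, s, destination, harvest)` is a hypothetical data structure chosen for this illustration.

```python
from collections import defaultdict


def total_harvest_by_patch(agents):
    """Sum harvests per patch, as in equation (8).

    agents: iterable of (home_group, s, destination, harvest) tuples.
    Locals (s == 0) count towards their home patch; roving bandits
    (s == 1) count towards the patch they travelled to.
    """
    totals = defaultdict(float)
    for home, s, destination, h in agents:
        patch = destination if s == 1 else home
        totals[patch] += h
    return dict(totals)


# Two groups: each has one local harvester and one roving bandit.
agents = [
    (0, 0, None, 1.0),   # group 0 member harvesting at home
    (0, 1, 1, 2.0),      # group 0 member harvesting in patch 1
    (1, 0, None, 0.5),   # group 1 member harvesting at home
    (1, 1, 0, 0.25),     # group 1 member harvesting in patch 0
]
H = total_harvest_by_patch(agents)
```
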

Use rights

As previously stated, investments in enforcing use rights increase the probability that in-group members have their harvest inspected and potentially seized. If an agent who harvested locally and above the MAH (hig > MAHg and sig = 0) is inspected, their harvest is confiscated and the value of the goods seized, zgir, is distributed to agents who invested in monitoring use rights in proportion to their share of the group’s total contribution:

$${z}_{gir}={I}_{rg}p\frac{{r}_{ig}}{{R}_{g}}\mathop{\sum }\limits_{i=1}^{{n}_{g}}({h}_{ig}| {h}_{ig} > {\rm{MA{H}}}_{g},{s}_{ig}=0).$$
(9)
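Equation (9) can be read as an expected-value payout: the pool of local harvests above the MAH is scaled by the inspection probability and the price, then split in proportion to each agent’s contribution. A Python sketch under that reading is given below; the function name and the handling of the case Rg = 0 (nobody is paid) are assumptions made for this illustration.

```python
def use_rights_payouts(harvests, roving, r, mah, inspection_prob, price):
    """Expected seizure income z_r for each group member (equation 9).

    harvests, roving, r: per-agent lists of h_ig, s_ig and r_ig.
    Only local harvests (s == 0) above the MAH enter the seizable pool.
    """
    pool = sum(h for h, s in zip(harvests, roving) if s == 0 and h > mah)
    total_r = sum(r)
    if total_r == 0:
        # Assumption: with no enforcement investment, nothing is paid out.
        return [0.0] * len(r)
    return [inspection_prob * price * (r_i / total_r) * pool for r_i in r]


# Three agents: one compliant, one over-harvesting locally, one roving.
payouts = use_rights_payouts(
    harvests=[1.0, 3.0, 2.5],
    roving=[0, 0, 1],
    r=[0.5, 0.0, 0.5],
    mah=2.0,
    inspection_prob=0.5,
    price=2.0,
)
```

Equation (10) for access rights has the same structure, with xig and Xg in place of rig and Rg and with the pool made up of all out-group harvests taken from the patch.
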

Access rights

The goal of investing in access rights is to find out-group members harvesting from the group’s resources and confiscate their goods. If an out-group member (sij = 1 and lij = g) is inspected, their harvest will be seized regardless of harvest size and MAH. The total seizures from out-group members, zgix, are paid out to individual agents as follows:

$${z}_{gix}={I}_{xg}p\frac{{x}_{ig}}{{X}_{g}}\mathop{\sum }\limits_{j=1}^{N-1}\mathop{\sum }\limits_{i=1}^{n}\left({h}_{ij}| {s}_{ij}=1,{l}_{ij}=g\right).$$
(10)

Payoffs

Finally, dropping the subscripts, we can define the payoff function for three separate cases:

$$\pi =\left\{\begin{array}{ll}ph+w(1-e)-{c}_{x}x-{c}_{r}r+{z}_{r}+{z}_{x}-{c}_{s}s,\quad &{{{\rm{if}}}}\,h\le {\rm{MAH}}\,\& \,s=0.\\ (1-{I}_{r})ph+w(1-e)-{c}_{x}x-{c}_{r}r+{z}_{r}+{z}_{x}-{c}_{s}s,\quad &{{{\rm{if}}}}\,h > {\rm{MAH}}\,\& \,s=0.\\ (1-{I}_{{x}_{l}})ph+w(1-e)-{c}_{x}x-{c}_{r}r+{z}_{r}+{z}_{x}-{c}_{s}s,\quad &{{{\rm{if}}}}\,s=1.\end{array}\right.$$
(11)

Note that Ixl is the inspection probability of the group that an agent who is a roving bandit has travelled to.
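The three cases of equation (11) differ only in the fraction of the harvest value that an agent expects to keep, which the following Python sketch makes explicit. The function name and all default parameter values are illustrative assumptions, not the values used in the simulations.

```python
def payoff(h, e, x, r, s, mah, i_r, i_x_dest, z_r=0.0, z_x=0.0,
           p=1.0, w=1.0, c_x=0.1, c_r=0.1, c_s=0.1):
    """Expected payoff from equation (11).

    i_r: use-rights inspection probability in the agent's own group.
    i_x_dest: access-rights inspection probability in the patch visited
              (only relevant when s == 1).
    """
    if s == 1:
        kept = 1.0 - i_x_dest      # roving bandit: exposed to access rights
    elif h > mah:
        kept = 1.0 - i_r           # local over-harvester: exposed to use rights
    else:
        kept = 1.0                 # compliant local harvester
    return (kept * p * h + w * (1.0 - e)
            - c_x * x - c_r * r + z_r + z_x - c_s * s)


# A local agent harvesting above the MAH with a 50% inspection probability.
pi = payoff(h=3.0, e=0.5, x=0.0, r=0.0, s=0, mah=2.0, i_r=0.5, i_x_dest=0.0)
```
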

Social learning

The fourth submodel is social learning. Here, agents use payoff-biased social transmission to update all possible traits. With probability (1 − ι), agents learn from members of their own group, and with probability ι they learn from members of a different group. If agents learn from an out-group, they simply sample one group at random. Regardless of whether agents learn from the out-group or in-group, they sample a number of agents (models) from the group and select the one with the highest payoff, compare that model’s payoffs to their own, and if the model’s payoff is higher, they copy all of that model’s traits72 with a small probability of copying error.
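One learning event can be sketched in Python as follows. The representation of agents as dictionaries with `payoff` and `traits` keys, the function name, the exclusion of the learner from its own in-group pool and the parameter `n_models` are all assumptions made for this illustration. Copying error is deliberately omitted because the text does not specify its form.

```python
import random


def social_learning_step(learner, own_group, other_groups, iota, n_models, rng):
    """Return the learner's traits after one payoff-biased learning event.

    With probability iota the models come from one randomly chosen
    out-group; otherwise they come from the learner's own group.
    """
    if other_groups and rng.random() < iota:
        pool = rng.choice(other_groups)
    else:
        pool = [a for a in own_group if a is not learner]
    if not pool:
        return dict(learner["traits"])
    # Sample candidate models and keep the one with the highest payoff.
    models = rng.sample(pool, min(n_models, len(pool)))
    best = max(models, key=lambda a: a["payoff"])
    # Copy all traits only if the best model out-earns the learner.
    if best["payoff"] > learner["payoff"]:
        return dict(best["traits"])
    return dict(learner["traits"])


rng = random.Random(1)
a = {"payoff": 1.0, "traits": {"e": 0.9, "s": 1}}
b = {"payoff": 3.0, "traits": {"e": 0.4, "s": 0}}
new_traits = social_learning_step(a, [a, b], [], iota=0.0, n_models=1, rng=rng)
```
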

Resource dynamics

Finally, the resource in each grid cell follows a standard logistic growth curve with a stock level of Bgt, a carrying capacity of k and a growth rate of v and has the following dynamics:

$${\hat{B}}_{gt}={B}_{gt}-{H}_{gt}$$
(12)

where \({\hat{B}}_{gt}\) is simply the resource stock after harvests have been removed. Thus, the resource regrows as follows:

$${B}_{gt+1}={\hat{B}}_{gt}+{\hat{B}}_{gt}v\left(1-\frac{{\hat{B}}_{gt}}{k}\right).$$
(13)

The right-hand term is the new additional growth, which we can define as \({g}^{{\prime} }({B}_{gt})\):

$${g}^{{\prime} }({B}_{gt})={\hat{B}}_{gt}v\left(1-\frac{{\hat{B}}_{gt}}{k}\right).$$
(14)

Importantly, gʹ(Bgt) is maximized when the post-harvest stock equals k/2, the stock level that produces the maximum sustainable yield (MSY). The total new recruitment at this stock level is the MSY and is equal to:

$${\rm{MSY}}=\frac{kv}{4}$$

Therefore, if harvests are above the MSY and do not adjust downwards, the resource stock level will equilibrate at the open-access equilibrium54.
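The resource update in equations (12) to (14) and the MSY expression can be checked with a short Python sketch. The function names are hypothetical, and flooring the post-harvest stock at zero is an assumption added here so that harvests exceeding the stock cannot produce a negative resource.

```python
def growth(remaining, k, v):
    """Logistic recruitment on the post-harvest stock (equation 14)."""
    return remaining * v * (1.0 - remaining / k)


def regrow(stock, total_harvest, k, v):
    """Remove harvests (equation 12), then add logistic growth (equation 13)."""
    # Assumption: the stock cannot fall below zero.
    remaining = max(0.0, stock - total_harvest)
    return remaining + growth(remaining, k, v)


def msy(k, v):
    """Maximum sustainable yield, k * v / 4."""
    return k * v / 4.0


# Harvesting down to k/2 yields recruitment equal to the MSY.
k, v = 100.0, 0.5
next_stock = regrow(stock=60.0, total_harvest=10.0, k=k, v=v)
```

Substituting a post-harvest stock of k/2 into equation (14) gives (k/2) · v · (1 − 1/2) = kv/4, which is the value returned by `msy`.
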

Parameterization

Our parameterization was carefully selected on the basis of the following criteria:

  • If groups failed to establish sustainable use rights, the resource would collapse.

    1. This ensures that any sustainable MAH is always lower than the MSY.

    2. This ensures that Hg = MSY is the socially optimal outcome when the system has no stochasticity.

  • For Ir = 1 and Ix = 1, all agents within the group must fully contribute to monitoring.

  • Some groups are seeded with unsustainable MAH policies, while others are seeded with sustainable MAH policies.

  • The model must check for invasion criteria.

Given these requirements, all simulations were run using the parameters found in Supplementary Table 1. Finally, the initialization values for m were seeded by providing each group with a mean sampled from \(\tilde{{m}_{g}} \sim \rm{Uniform}(0.1,4)\). Each individual then receives an offset mgi ~ N(0, 0.1) with a boundary condition at 0. Full parameter sweeps can be found in Supplementary Information.
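The seeding of m described above can be sketched in Python as follows. Two points are assumptions: the second argument of N(0, 0.1) is treated as a standard deviation, and the boundary condition at 0 is implemented by clipping negative values to zero. The function name and the group and agent counts in the example are hypothetical.

```python
import random


def seed_preferences(n_groups, n_agents, rng, low=0.1, high=4.0, sd=0.1):
    """Draw initial policy preferences m for every agent in every group.

    Each group receives a mean from Uniform(low, high); each agent adds a
    normal offset. Assumptions: sd is a standard deviation, and values
    below zero are clipped to zero.
    """
    groups = []
    for _ in range(n_groups):
        group_mean = rng.uniform(low, high)
        members = [max(0.0, group_mean + rng.gauss(0.0, sd))
                   for _ in range(n_agents)]
        groups.append(members)
    return groups


rng = random.Random(42)  # fixed seed so the example is reproducible
initial_m = seed_preferences(n_groups=3, n_agents=5, rng=rng)
```

Because group means are drawn over a wide range, some groups start with policies above any sustainable harvest level and others below it, which matches the seeding criterion listed above.
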

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.