Quantifying Earth system interactions for sustainable food production via expert elicitation

Several safe boundaries of critical Earth system processes have already been crossed due to human perturbations; not accounting for their interactions may further narrow the safe operating space for humanity. Using expert knowledge elicitation, we explored interactions among seven variables representing Earth system processes relevant to food production, identifying many interactions little explored in Earth system literature. We found that green water and land system change affect other Earth system processes strongly, while land, freshwater and ocean components of biosphere integrity are the most impacted by other Earth system processes, most notably blue water and biogeochemical flows. We also mapped a complex network of mechanisms mediating these interactions and created a future research prioritization scheme based on interaction strengths and existing knowledge gaps. Our study improves the understanding of Earth system interactions, with sustainability implications including improved Earth system modelling and more explicit biophysical limits for future food production. Determining the safe operating space for sustainable food production depends on the interactions of multiple processes within the Earth system. Expert knowledge provides critical insight into how these processes interact that improves Earth system modelling and our understanding of the limits of global food production.

F ood production is the main cause of environmental impacts such as biodiversity loss 1,2 , eutrophication 3 and overexploitation of marine resources 4 . In particular, agriculture drives 80% of global deforestation and 70% of freshwater withdrawals 5 . Human pressures may have already pushed the Earth system beyond the safe operating space for humanity, as demarcated by the nine planetary boundaries describing critical Earth system processes 6,7 . Beyond these boundaries, the risk of abrupt or irreversible global environmental change increases, with the potential to push the Earth system out of its stable Holocene condition, thus threatening the capacity of humanity to develop and thrive 7,8 .
Recent literature and modelling studies 9,10 provide the first quantitative estimates of an array of interactions among the Earth system processes represented by the planetary boundaries. Their findings suggest that the cascades and feedbacks of interactions amplify human impacts on the Earth system. Accounting for these interactions may narrow the estimated global safe operating space for human activities but also illuminate synergies: decreasing impacts on one Earth system process may decrease impacts on others. Given that many interactions relevant to food production are not quantified by existing studies 9,10 and that marine processes have received relatively little attention within the planetary boundaries literature 11 , achieving sustainable food futures may require even greater food system reforms than previously suggested 12 .
Interactions among Earth system processes are challenging to account for in food system analyses and spatially disaggregated models. The interactions and interaction strengths are partially unknown; they can be context-specific and not all control variables are suitable for spatial models. Thus, research on respecting the global safe operating space is complemented by exploring how to stay within critical limits of ecosystems at smaller scales 13 . The effects of perturbations on key Earth system processes are often manifested locally 1,[13][14][15] , aiding in detecting ecosystem-level impacts of interactions. Further, advances in climate, ocean and terrestrial modelling [16][17][18] enable exploration of complex interactions, improving understanding of their roles in climate and sustainability actions and outcomes 12 .
Here, we concentrate on four key Earth system processes relevant to food production 12 that probably have already been transgressed 5 : biogeochemical flows, biosphere integrity (BI), freshwater use and land system change. We divide BI into land, freshwater and ocean components and freshwater use into blue and green water (Methods; Figs. 1 and 2a). We thus evaluate interactions among seven terrestrial and aquatic control variables-that is, the functional indicators of the underlying Earth system processes (Fig. 2a). All included Earth system processes are bottom-up in nature, as opposed to, for example, climate change, which is a top-down process and for which interactions with other Earth system processes have been explored previously 9,10 .
Our research has two major contributions: (1) understanding the richness of interactions and interaction mechanisms among Earth system processes relevant to food production and (2) quantifying the strength of these interactions by expert elicitation. Our map of a wider array of Earth system processes than previously available (for example, ref. 9 ) provides unprecedented information on the interaction mechanisms and benefits Earth system modelling among other fields. Our quantification of interaction strengths advances our understanding of the safe operating space for food production, as few quantifications of these interactions are available. While ref. 9 provide some literature-based quantifications, model-based quantifications are rare because they are time-consuming and require focusing on a very limited set of interactions at a time. For example, ref. 10 quantify interactions among three Earth system processes at regional scale (climate change, surface runoff and vegetation cover), while we include seven at local scale (Fig. 1). In sum, our work increases comprehensive quantitative knowledge of Earth system interactions relevant to food production.
The expert knowledge elicitation was conducted following the IDEA (investigate, discuss, estimate, aggregate) structured elicitation protocol 19 (Methods). In an expert knowledge elicitation, judgements of an unknown phenomenon can be expressed quantitatively in a statistical format 20 together with collecting qualitative data on the experts' reasoning behind their responses. Thus, the results can be used, for example, as model input or in decision-making when limited or no other information is available 21,22 . As we are very early in the process of exploring interactions among Earth system processes relevant to food production, we argue that expert knowledge is an excellent first step in gathering information accumulated through experts' training and experience. The elicitation was done for a hypothetical study area with a scenario-based technique 23 (Fig. 2b; Methods), guiding experts in making inferences relative to a fixed baseline. This allowed us to account for state-dependent interactions and explore a wider array of interactions compared to strictly model-based quantifications.
With this study, we provide a systems view of the interactions, their strengths and the mediating mechanisms until more model-based results become available. We envision that our results will benefit the identification and prioritization of synergistic actions on developing food systems and be useful for modelling the Earth system as well as food systems and agriculture, for natural resource managers and for the development of the planetary boundaries framework.
(1) Earth system processes of interest

Results
In the following subsections, we present the identified interactions, their strengths and roles and we describe the mechanisms mediating them. Finally, we compare our findings with relevant literature and make suggestions for future research prioritization.
Identified interactions and their roles. Experts identified 37 direct biophysical interactions between the selected control variables (out of a possible total of 54), which suggests considerable local interconnections of Earth system processes (Fig. 3). Some of these interactions, such as the impacts of land system change on BI land, are Biodiversity intactness index (BII)* A proxy for functional diversity. BII assesses change in population abundance as a result of human impacts. The global boundary is set at 90% BII relative to pre-industrial levels.
Biomass of keystone fish species Forested land area* Leached N concentration in runoff to surface waters Seasonal river discharge A proxy for ecosystem functioning. Biomass ≥ 0.5 K is considered within safe biological limits. K is carrying capacity, the maximum size the population can reach in its environment.
Forest cover relative to potential forest in biome scale. The boundary is set at 85% for tropical and boreal biomes and 50% for temperate biomes. Global boundary at 75% of original forest.
A proxy for N concentration in surface waters. Concentration in runoff should not exceed 1 mg N l -1 to prevent aquatic ecosystem eutrophication.
River discharge should stay above local environmental flow requirements, which range from 45 to 75% (low flow season) and 15 to 45% (high flow season) of pre-industrial flows.
Growing season soil moisture Growing season root-zone soil moisture is used as a proxy for various Earth system functions of green water. It is assessed relative to average pre-industrial conditions.  control variables indicated with an asterisk are the same as defined in ref. 7 . For control variables without existing boundary values, safe ranges were developed and set (Supplementary Section 2). b, A hypothetical area of 100 km 2 with control variables within safe ranges was used to assess the interactions among the earth system processes relevant to food production by experts in this scenario-based elicitation.
well known and documented (for example, refs. 1,14 ) but many of them, such as the interactions between the aquatic (BI freshwater and BI ocean) and BI land components, have not been previously quantified ( Fig. 3 and Supplementary Table 5.1). Additionally, some interactions identified in this work were considered non-existent in ref. 9 .
Experts estimated strong impacts on aquatic biodiversity (BI freshwater and BI ocean) originating from other Earth system processes. For example, BI freshwater was seen to be especially affected by changes in blue water, biogeochemical flows and BI land (Fig. 3). Large negative impacts on BI freshwater can be caused by, for example, increased nitrogen concentration in surface waters  and decreased discharge, while a decrease in BI freshwater can substantially influence BI land and BI ocean. This reflects the importance of considering all three components in future BI assessments. Green water and land system change are involved in many strong interactions ( Fig. 3 and Supplementary Table 5.1): for example, a decrease in soil moisture can directly impact blue water, BI land and land system change. At the same time, soil moisture can be reduced when forest cover decreases, reinforcing the feedback effects. Land system change, as expected, is a major cause of changes in other Earth system processes, notably BI land and blue water (Fig. 3). Very few interactions were found to be attenuating, which would mean that when one variable moves towards the outside of the safe space, the other variable moves towards the safe space. The interaction from land system change to blue water was the only one that experts identified as strongly attenuating, whereby a decrease in forest cover leads to increased river discharge (Fig. 3). However, experts agreed that this might hold only at the local scale, as large-scale decreases in forest cover tend to cause regionally drier conditions due to decreased precipitation 24 . Experts also identified a weak attenuating influence of biogeochemical flows on land system change, as increased nitrogen in runoff can boost plant productivity; however, substantial increases in nitrogen may turn the direction of this interaction to amplifying (Prioritization of interactions). As attenuating interactions are rare, parallel impacts of amplifying interactions-in which further perturbation of a control variable negatively perturbs others-are likely. Opportunistic actions benefiting from trade-off situations, in which further perturbation of a control variable would improve the state of others, are therefore not viable but improving the state of many control variables through acting on a single variable is possible.
For seven of the interactions, the interaction strength was judged extremely weak (interaction strength <0.005 on a scale 0-1). We tested whether including the extremely weak interactions in the force-directed network presented in Fig. 4 would change the results. As it did not, we excluded these interactions from further analysis and the prioritization scheme outlined in Prioritization of interactions. The weak relationships may reflect truly weak interactions but it may also be the case that they are present (and important) only in specific environments. Further, they may be more complex than the others, not following simplified linearity assumptions and therefore have not been well-characterized in existing studies (Supplementary Section 5). For further information, relevant literature for the identified interactions and all individual answers of the second elicitation round are provided in Supplementary Data.

Identified interaction strengths in line with literature.
Seven of the 37 identified interactions (Fig. 3) were quantified at the global scale by ref. 9 . For these seven interactions, our study and ref. 9 agree on the direction of interactions, although with some differences in strength. A strict comparison between the two studies is not possible due to differences in normalization, the control variables and the spatial scale but relative comparisons are shown in Supplementary  Table 6.1. For five interactions, the interaction strengths were similar (from low to high interaction strength range). For two interactions (land system change on blue water and biogeochemical flows on BI ocean), the difference in interaction strength can be attributed to the different control variables and spatial scale used. For further comparisons between our results and quantitative estimates of selected interactions, see Supplementary Section 6. The adequate correspondence between our results and literature shows that expert opinions, elicited via a rigorous and formal protocol, capture the variation that individual studies have identified. Therefore, the relative interaction strengths identified here are robust but the interaction strengths should be interpreted cautiously outside this work, preferably in conjunction with quantitative modelling.
Receiving and originating Earth system processes. The role of different Earth system processes in the interactions varies, as shown in the network of the identified interactions and their strengths (Fig. 4). The three Earth system processes with the greatest number of identified connections in the network are BI land, land system change and green water (8-10 out of a maximum 12; Fig. 4 and Supplementary Table 5.3). On the basis of the net receiving and originating interaction strengths (Supplementary Table 5.3), three main categories emerge: (1) processes mainly on the receiving end, meaning that they are affected by others; (2) processes mainly on the originating end, meaning that they affect others; and (3) processes that are both receiving and originating at similar levels. This information could help in identifying impactful practical actions; for example, alleviating green water perturbations would be expected to alleviate perturbations on BI land, land system change and blue water as those receive strong interactions originating from green water. The three BI components comprise the first group, as they receive the greatest net interaction strength (Figs. 3 and 4 and Supplementary Table 5.3). This is in line with two studies 7,9 that identify BI as one of the two core Earth system processes considered in the planetary boundaries framework. Land system change and green water exemplify the second category, as the greatest net interaction strength originates from them (Figs. 3 and 4 and Supplementary Table 5.3). These findings are consistent with ref. 9 identifying land system change as a major mediator of interactions among Earth system processes, refs. 25,26 suggesting that focusing only on blue water does not capture all the crucial Earth system functions of freshwater and ref. 27  Mediating mechanisms. The experts identified an array of primary and case-specific mechanisms mediating the interactions (Supplementary Section 7 and Supplementary Fig. 7.1). The main mechanisms are shown in Fig. 5, which illustrates that the interactions  indicate that an increase (decrease) in one variable leads to an increase (decrease) in another variable. Negative links (red) indicate that an increase (decrease) in one variable leads to a decrease (increase) in another variable. the network of mechanisms is not exhaustive because some links are uncertain (Fig. 6) and dependent on impact level, spatial scale, temporal dynamics and earth system processes beyond the scope of this study.
are complex and interconnected, even when limiting the scope to Earth system processes relevant to food production. Moreover, different mechanisms can have counteracting effects on the control variables, leaving the resulting interaction to be determined by the presence, importance and characteristics of different mechanisms. For example, forest cover is negatively impacted by wildfires, droughts and seedling consumption but benefited by pollinators, seed dispersal, decomposers, soil moisture and terrestrial productivity. These mechanisms vary substantially: for example, wildfires affect forest cover extremely quickly, whereas forest growth is affected very slowly. The net interaction strength is therefore tightly dependent on which mechanisms prevail at a time. Due to this, we aimed to explore the relative importance of the different mediating mechanisms during the second elicitation round. While we were unable to make robust inferences on the matter, the limited data related to ranking, together with details and relevant literature on the identified mechanisms, are available in Supplementary Data. Understanding the mediating mechanisms is key to finding action points to establish beneficial synergies. For example, decreasing forest cover (land system change) and soil moisture (green water) increase erosion, which further increases N loss and sediment flow, both of which ultimately lead to negative changes in BI freshwater, BI ocean and biogeochemical flows. Preventing erosion or repairing its impacts, could then be used as an effective mechanism to alleviate the pressure on many control variables at once. Other high-impact actions include, for example, improving riparian and coastal habitat integrity through conservation measures or agroecosystem integrity, which would directly affect BI freshwater and BI ocean and also promote mechanisms affecting soil moisture, river discharge and forest cover, primarily through increased terrestrial productivity and its effects on the hydrological cycle. Although impacts can be expected to diminish along long mechanism chains and vary between contexts, the rich network of mechanisms highlights that actions often affect an ensemble of the control variables instead of only one at a time.

Prioritization of interactions.
The experts had high agreement on interaction magnitudes; the coefficient of variation in their estimates was low in all but one interaction (Supplementary Table 5.2).
The number of responses per interaction, however, varied considerably, which increases the uncertainty of the results. Due to the potential of increased bias and error owing to the lower number of responses, we created a prioritization scheme for future research on the basis of the interaction strength and uncertainty (Methods; Supplementary Table 5.2). We suggest that focus should first be given to strong interactions with limited current knowledge and/ or high uncertainty. Those interactions could unexpectedly breach the safe operating space, as the interaction strength is high but relatively little is known about the mediating mechanisms. In addition, we highlighted interactions with discrepancies in expert opinions regarding whether they were amplifying or attenuating. These discrepancies could be related to temporal scales (variability in the occurrence time of processes), different regional contexts (variability in local environmental factors) or different mediating mechanisms (variability in which mechanisms dominate the net interaction). For such interactions, regional-scale studies should be prioritized to identify the importance and temporal scales of the involved mechanisms. In cases of discrepancies, we determined the interaction direction in Fig. 3 by the majority of responses.
The high-uncertainty interactions from biogeochemical flows to BI freshwater and BI ocean are prime examples of expert opinion discrepancies (Fig. 6). Some experts considered the positive impacts on primary productivity and ecosystem functioning from added nutrients-before a critical limit is passed and impact becomes negative-while others considered added nutrients causing immediate negative impacts. This critical limit is very context-specific, including factors such as the denitrification potential of riparian wetlands 28 . An appropriate critical limit might vary among environments-thus, this interaction needs more case-by-case examination. In addition, including elements such as phosphorus 29 could substantially modify the strength of biogeochemical flows interactions on BI freshwater and BI ocean 30 . Other notable uncertainties include, for example, the interaction from blue water to BI ocean, which was established but which is especially uncertain due to the low number of responses.
Medium-uncertainty interactions related to the BI components deserve further attention due to them being overall strongly impacted ( Fig. 6 and Receiving and originating Earth system processes) and their central role in the Earth system 7,9 . This applies in particular to interactions involving the aquatic BI components, as they have only recently been discussed and explored in relevant literature (for example, ref. 11 ). Most interactions related to land system change, blue water and green water are low-uncertainty interactions (Fig. 6), as expected given that these Earth system processes and their relationships have been widely explored (see Supplementary Data for relevant literature). However, some interactions with a greater number of responses (for example, land system change on blue water) show discrepancies related to the direction of the interactions, even though agreement on the magnitude of the interaction strength is high (Fig. 6). Again, this could be attributed to differences in contexts and timescales, which highlights that case-by-case approaches are required for operationalizing our findings.

Discussion
We have identified interactions and mediating mechanisms between Earth system processes relevant to food production at the local scale and quantified their strengths using expert knowledge elicitation. Out of 54 possible interactions, 37 were identified by the experts. BI freshwater, BI ocean and green water were found to have crucial roles in the interactions. In addition, our results highlight the major role of land system change impacting other Earth system processes and the high impact of biogeochemical flows on BI freshwater and BI ocean (Figs. 3 and 4). Our study maps the mechanisms involved in interactions among these many Earth system processes in detail, revealing a complex and interconnected mechanisms network (Fig. 5). Taking advantage of synergistic mechanisms, through which many control variables can be affected by change in one, is key to limiting further anthropogenic perturbations on the Earth system. Our categorization of the interactions on the basis of their strengths and associated uncertainty (Fig. 6) can guide future research.
Bridging the local and global scales. By illuminating the complex interactions between the Earth system processes relevant to food production, our results highlight the need for an holistic approach to environmentally sustainable food production and suggest potential future developments of the planetary boundaries framework. The planetary boundaries were developed to understand the Earth system limits within which humanity can thrive 6,7 and the framework has often been seen as a connector between Earth system and sustainability sciences 31 . At the same time, the framework has been criticized for being a strictly top-down concept, while many of the relevant Earth system processes and stresses occur locally-although with global importance [32][33][34] . Moreover, interactions between many of the Earth system processes take place on a local-to-regional scale. Without understanding these interactions and the mechanisms mediating them, governance that aims to keep us within safe global boundaries could be critically misdirected and defeat its purpose.
Our findings reveal the directions and strengths of many interactions between Earth system processes relevant to food production. Since most interactions identified here were amplifying ones (Fig. 3), the safe operating space may be narrowed by one Earth system process degrading other Earth system processes. However, finding positive synergies is also possible: alleviating pressures on one Earth system process, such as land system change, can simultaneously alleviate pressures on others, such as BI and biogeochemical flows, through the complex web of interactions among them (Figs. 3-5; ref. 9 ). By better understanding both the strength and the mediating mechanisms of these interactions, our results have clear implications for sustainability management in (1) avoiding unintended consequences of actions; (2) emphasizing synergistic solutions to sustainability challenges; and (3) identifying and prioritizing management of the core processes that most impact and are impacted by other Earth system processes. Our study is thus an important step towards enabling more comprehensive consideration of Earth system processes across sectors and disciplines until more model-based information on interactions becomes available to enrich and improve our findings.
Our work could help to adapt the planetary boundaries framework to the scale at which management of Earth system processes-such as conservation of natural areas, limitation of nitrogen application and regulation of water withdrawals-typically occurs. Staying within the planetary boundaries and thereby keeping humanity within the global safe operating space, requires adjusting local safe operating spaces with respect to the related Earth system processes and interactions shown here. Our findings may also augment local and regional food system models and assessments. Detailed, model-based quantification of interactions between Earth system processes is time-consuming and few are currently available 10 . Therefore, incorporating our findings into models, until other sources of information become available, could enable quantifying aspects beyond the use of resources only-such as the substantial impacts on BI. Taking advantage of all data-while acknowledging the related uncertainties-should be preferred to waiting for the 'perfect data' and potentially delaying action, thereby causing irreversible damage.
At the same time, it should be noted that the mechanisms mediating the interactions, interaction strengths and even the interaction directions vary in different contexts, while the results presented here show only their aggregate outcomes. Further, a better understanding of local-scale mechanisms potentially cascading to planetary-scale feedbacks is needed for prioritizing management actions. Globally consistent datasets for many of the control variables are already available (for example, refs. 1,[35][36][37][38][39][40] ). These datasets can be used in local-scale models to validate our interaction strengths while retaining global consistency, which would reduce the uncertainty in comparing results between different local contexts. However, this endeavour would require case-specific knowledge and sophisticated models. Therefore, while we have validated our results with comparisons to ref. 9 and various case studies (Supplementary Section 6), we leave extensive local-scale validation for future work. Here, we provide both the interaction strengths and mechanisms together with a roadmap of prioritization that can guide future efforts.

Importance of expert knowledge.
In modelling, process dynamics should be represented in mathematical terms but an exact representation of the Earth system is beyond our current capabilities and models will necessarily be based on simplified process descriptions. Deciding which subprocesses to model, what assumptions to make and how to represent interactions between the processes when all mediating mechanisms cannot be fully modelled are all expert decisions 41 . Furthermore, as we have shown here, the mediating mechanisms vary with local context. This is also reflected in an expert elicitation; the experts have their own backgrounds that affect their views and decisions 42 -such as different fields of study or familiarity with different natural environments-which may lead to apparent discrepancies.
This expert knowledge elicitation revealed that not all experts agreed on the direction of some interactions (Fig. 6). Such occurrences are valuable, as they document differing views and beliefs about the same process as well as statistical variability. Combined, expert responses provide holistic views-beyond statistical variability-of complex phenomena that are difficult to study in quantitative terms. Even though modelling has traditionally been based on collected quantitative data, the further we increase the complexity of what we aim to model, the higher the data demand becomes. Thus, using expert knowledge has become common and is applied in different fields and for various purposes-for example, in ecosystem modelling 43 , risk assessment 44 and even augmenting machine learning models 45 . Our expert elicitation on complex Earth system process interactions is a notable step forward in better understanding of processes that would otherwise remain unknown.

Methods
The following section describes the main steps in our methodology as illustrated in Fig. 1. We first explain the control variables we used for each of the Earth system processes of interest (Fig. 2a) and the hypothetical study area (Fig. 2b) and introduce the structured elicitation protocol. We further describe the methods used to aggregate and normalize the elicitation results to quantify the interaction strengths. For further details, especially related to the elicitation process and protocol, see Supplementary Sections 1-4.

Definitions of Earth system process control variables.
For the BI land component, we retained the control variable used by ref. 7 , the biodiversity intactness index, a proxy variable for functional diversity (Fig. 2a). Agricultural impacts on BI have been demonstrated at a local scale 1 and retaining the existing variable therefore fits our purposes well. The BI freshwater and BI ocean components were included on the basis of recent suggestions of their importance in BI assessments 9,11 . For both BI freshwater and BI ocean, we used the status of keystone fish species biomass as a control variable. Others 9 assign ecosystem functioning as the control variable and use global fisheries status as a proxy for some of the interactions they identify; thus, our control variable is similar as they both assess biomass levels. In aquatic environments, keystone fish species act as a robust indicator of ecosystem functioning and play a critical role in determining community structure [46][47][48][49] . In addition, freshwater habitats in particular have experienced a substantial decline in biodiversity due to human activities and environmental change 50,51 . Therefore, keystone fish biomass can act as a control variable to assess the aquatic components of BI.
The control variable for land system change is forested land area relative to potential forest cover (that is, assuming no human land cover change; Fig. 2a)-a variable retained from ref. 7 . For biogeochemical flows, we assessed nitrogen, using leached inorganic N concentration in runoff to surface waters as the control variable 52 . For blue water, the control variable used is river discharge, relative to the pre-industrial average. In addition, to account for seasonal variation in river flows, we separated the blue water interactions into effects during high-and low-flow periods. While others 7 propose maximum allowable water withdrawals, we focused on the flow remaining in rivers after any discharge alteration. Extending the control variable beyond withdrawals captures discharge alteration due to both direct human impacts, such as water extraction 53 and indirect human impacts, such as climate change 54 and changes in atmospheric moisture recycling 55 . For green water, the control variable we used is root-zone soil moisture during the growing season relative to the pre-industrial growing season average (Fig. 2a), similar to the control variable suggested in ref. 27 . Though green water is not identified as a separate control variable within the freshwater use boundary of the original planetary boundaries framework, recent research 25,26 proposes that focusing only on blue water does not capture all crucial Earth system functions of freshwater and thus we considered green water to be indispensable. For more details on the control variables used, see Supplementary Section 2.
Elicitation process. Expert knowledge elicitation has been applied within various fields of environmental sustainability-focused research (for example, refs. [56][57][58][59][60] ) and its suitability for natural resources management has been demonstrated (see for example, applications in refs. [61][62][63]. With an elicitation, we can formulate expert knowledge and beliefs about potential uncertainties into a probabilistic form 64 that can subsequently be treated as empirical data 63 and used for modelling purposes when other data are unavailable. Here, we followed the structured IDEA elicitation protocol 19 for a remote expert knowledge elicitation. Structured elicitation protocols help reduce bias and error associated with heuristics that experts use when making judgements 65,66 . The IDEA protocol is a structured modified Delphi approach that leads to improved judgements when a diverse group of engaged experts participate 21 . It combines the benefits of Delphi 61,67 and four-step elicitation processes [67][68][69] , which in combination has been shown to improve judgements 65,70,71 . Our elicitation process consisted of two anonymous elicitation rounds, between which was an online discussion round using pseudonyms (Fig. 1). Details of the elicitation protocol are available in Supplementary Section 1.1 and Supplementary Table 1.2. The discussion round is a critical part of the process, as it decreases linguistic ambiguity, promotes critical thinking and shares evidence. The IDEA protocol integrates elicitation and discussion because there is evidence that, when a discussion stage is included in a standard Delphi process, the response accuracy of the second elicitation round increases 72 . The discussion phase also proved to be beneficial as it decreased ambiguity regarding the questions and increased agreement among experts (Supplementary Section 1.4 and Supplementary Fig. 1.1 with details related to the discussion round and Supplementary Table 1.5 with the agreement metric in the two elicitation rounds).
Participants were recruited on the basis of their expertise in any of the Earth system processes considered in this study and their knowledge of the planetary boundaries framework. For the recruitment process, relevant literature on Web of Science was searched and, once a list of 200 potential participants was reached, the literature-based recruitment was concluded (Supplementary Section 1.3). All 200 potential participants were invited to participate in the elicitation via e-mail. In addition, the 'snowballing method' was used: when potential participants were first contacted, they were asked to suggest further suitable participants. This resulted in inviting 31 additional potential participants. Earlier planetary boundaries work has focused on certain Earth system processes (especially land system change, blue water and BI land) and, thus, representation of all desired disciplines was not equal and was reflected in the number of answers received for each interaction. In total, 37 experts completed the elicitation process, resulting in 5-19 answers for each of the identified interactions. Literature suggests that a minimum of four to six experts should be included in an elicitation 73,74 , with empirical evidence suggesting that only minor improvements are gained when having more than 6-12 participants [74][75][76] . For details on the recruitment process and experts' background see Supplementary Table 1 The remote expert elicitation was performed with a web-based application specially developed for this purpose using the 'Shiny' R package 77 . Although it comes with its own challenges related especially to usability and user experience, the benefits of a custom-made application are that it minimizes the amount of material shared with participants and can be fully tailored to a specific task. The web application (available at https://chrysafi1.shinyapps.io/shiny_exp_elic/) displayed everything a participant needed to complete the full elicitation process, consisting of a consent form, background information on the elicitation process, the Earth system processes, the control variables to be assessed, a question example and a dashboard for selecting specific interactions and collecting the inputs.
Experts were asked to evaluate the interactions within the hypothetical area and to elaborate on their thinking process behind the provided answers. A scenario-based approach is particularly constructive when knowledge is undeveloped at a theoretical level 23 , such as in our case, in which exploration of the complex interactions between Earth system processes is still at an early stage. Due to the large number of potential interactions explored and to minimize complexity and time required from the participants to complete the elicitation, we focused only on one scenario in which all Earth system processes are within safe limits (Supplementary Section 2). Additionally, experts had the possibility to set this hypothetical scenario in a region of their choice for their responses.
The questions asked to the experts followed a four-step format. An example question is provided in Supplementary Section 1.2 and all elicitation questions can be explored by accessing the web application. For each interaction, we asked how a change ΔX in the control variable (X) would alter the current level of another control variable (Y). The experts first gave their estimates for the lower and upper plausible values and then their best estimate. In addition, a confidence interval (CI) for the provided estimates was asked. The upper and lower plausible values describe the limits of an expert's CI; for example, assigning a 70% CI means that the expert believes that there is a 70% probability that an interaction strength value would fall within the interval of the upper and lower value, with the best estimate as the most likely value. This format helps experts to construct and convert their knowledge into a quantitative form 19 . Participants were encouraged to provide input only for the interactions they felt best fit their expertise.
Aggregation of expert opinions. Commonly, applications of the IDEA protocol use quantile aggregation in determining the aggregate value of many individual expert responses 19,71 . Although ref. 71 showed that other aggregation methods could lead to less overconfident estimates, they also note that quantile aggregation is simple and informative in providing the best estimate without entailing additional distribution assumptions. As our main results focus on the best estimates, we deemed quantile aggregation an appropriate method to aggregate the expert responses. Before the quantile aggregation, we fit expert responses with a PERT distribution 78 to explore regional divergence among expert opinions. The PERT distribution, which creates a smooth distribution based on three parameters-the lower, upper and best (most likely) value-is a suitable distribution to fit when following the four-step question format that specifically asks for these estimates. All expert answers that did not show substantial regional divergence based on the PERT distribution were then used to perform a quantile aggregation for the lower, best and upper values with a CI of 80% (Supplementary Table 4.2), as described in the IDEA protocol practical guide 19 .
Calibrating the experts would have been a very challenging endeavour, as data for a similar situation from a known system are not available to our knowledge. Additionally, finding a single or few test questions that could fit all expert backgrounds would also have been difficult. One option that we did not want to opt in for was weighting experts by their academic position, as it is a highly ambiguous criterion for performance. Thus, expert opinions were aggregated with an unweighted median of the PERT distribution fitted on the basis of the individual responses. The median was selected instead of the mean to minimize the effect of outliers when equally weighting small groups 19 . Experts could also provide an example region and free-form details in their assessment when quantifying an interaction (Supplementary Section 1.2). This possibility was made available with the initial goal that if sufficient regional input became available, region-specific interaction estimates could be investigated. However, regional inputs were insufficient in number and the results for each interaction are therefore a mix of region/non-region-specific answers representing a world-generic average interaction strength. To consider differences between the non-region-specific and region-specific answers that could lead to lost or skewed information if ignored during aggregation, the following steps were performed for each of the interactions: (1) For each expert, all estimates (best, lower, upper, CI) were standardized to 100% CI with linear extrapolation 67,79 on the basis of the CI they provided. This standardization was performed to enable fitting a PERT distribution. (2) Non-region-specific best, lower and upper values were aggregated with an unweighted median. (3) A PERT distribution was drawn with the 'mc2d' R package 78 for the non-region-specific aggregated values. (4) A PERT distribution was drawn for each non-region and region-specific set of estimates (lower, upper, best). (5) Each non-region-specific distribution was compared to the non-region-specific aggregated distribution with the Kullback-Leibler divergence metric within the 'LaplacesDemon' R package 80 . The 95th percentile of divergence values were used as the limit for aggregation acceptance for the region-specific distributions. (6) Each region-specific distribution was compared to the non-region-specific aggregated distribution with the Kullback-Leibler divergence metric. (7) Region-specific distributions with divergence below the limit for aggregation acceptance were accepted for aggregation. These region-specific responses were thus considered statistically similar enough with the non-region-specific responses and viable to include in aggregation. (8) All non-region-specific and accepted region-specific values (lower, upper and best) were aggregated with an unweighted median.
The final aggregated values were used to estimate the interaction strengths as described in the following section. The region-specific expert responses that were not aggregated due to being statistically too divergent are available in Supplementary Table 3.1. These single-region-specific answers were not sufficient to make robust inferences but, combined with relevant literature or other available data, they could still be useful for other studies.

Estimating interaction strengths between control variables.
To estimate interaction strengths, we first normalized the control variables relative to the theoretical natural state for each of the control variables x = X/X tns (Supplementary Table 1.1 and Supplementary Table 4.1) and then estimated their interaction strength as s = Δy/Δx with the above normalization for every direct interaction between two control variables X→Y, where x is the normalized state, X the current state, X tns is the theoretical natural state for each control variable, Δx is the change in the normalized control variable X and Δy the normalized change caused in Y by the change in X. Only direct interactions were used for the normalization, while indirect interactions were excluded to remove double counting (Supplementary Section 4). With this approach, we quantified the absolute normalized interaction between two control variables. By this normalization, we can better assess the impact of a change in ΔX on Y and how this could contribute to Y more rapidly approaching the outer border of its safe range. The significance of this absolute interaction on how quickly the local safe operating space could be transgressed would depend on local critical limits for each of the variables, which are highly context-specific-for example, ref. 81 describe local critical limits for BI that are variable in different biomes. Thus, further case-specific investigations would be required to evaluate this, which is outside the scope of this article.
The normalization of interactions related to biogeochemical flows posed greater challenges compared to others because of the selected control variable. On the basis of the nitrogen concentration in surface waters that EU member states use to define fair ecological status 82 and the upper critical limit defined by ref. 52 , we used a concentration of 2.5 mg l −1 of dissolved inorganic N as the theoretical natural state used in the normalization. In contrast to the other control variables that move from the safe range towards zero, nitrogen moves outside the safe range from zero to higher values, as nitrogen concentration increases while the other control variables' states decrease (Supplementary Table 4.1). As a result, the interaction strength values of amplifying interactions related to biogeochemical flows are negative as the variables move to opposite directions and values of attenuating interactions are positive as the variables move to the same direction. The contrary is the case for all other control variables as well as for the interactions not involving biogeochemical flows. In the main results, the sign of the interactions is not highlighted but only the direction of either being amplifying or attenuating. Additionally, in the Results, we present the interaction strengths with the aggregated best estimates, while the 80% CI for Δy caused by Δx can be found in Supplementary Table 4

.2.
Limitations of expert knowledge elicitation. Both laypeople and experts are sensitive to subjective biases 83,84 . Moreover, the reliability of expert judgements depends on who participates and how questions are posed 19 . For this expert elicitation, we invited leading experts within the planetary boundaries framework and whose judgement was supported by authorship of relevant scientific publications. Despite the limitations of such non-model-based approaches, expert opinions are valuable 85 , especially in this case where modelling capacity is currently too limited to handle all the complexity of the Earth system 7,86 ; expert opinions are thus necessary to advise on such critical matters 21,22 . Formal structured elicitation protocols, such as the one used here, have been developed to minimize the limitations and associated biases of expert judgements 65,66,87 . Even though a longer elicitation process is associated with declining quantity and quality of information, it appears that the benefits of following a structured protocol outweigh the potential drawbacks 87 . Finally, even though our expert-elicited interaction strengths would have benefited from an increased number of responses for certain interactions-as a larger number of responses is generally associated with less bias when aggregating 66 -we can still be confident in our assessment. This is due to the high agreement among experts after the second elicitation round (Supplementary Table 5.2) and the empirical evidence suggesting that only minor improvements are gained by having more than 6-12 participants 19,[74][75][76] .
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
All the data produced in this study are available in the Supplementary Information and Supplementary Data and provided with the submission.

Code availability
The analysis was performed using R studio (R v.4.0.5). The code is available upon request from the first author.