Modelling regional futures at decadal scale: application to the Kimberley region

We address the question of how to provide meaningful scientific information to support environmental decision making at the regional scale and at the temporal scale of several decades in a network of marine parks in the Kimberley region of Western Australia. Where environmental sustainability is affected by slow-dynamics climate change processes and one-off investments in large infrastructure which can affect a region for decades to come, both strategic and reactive planning is necessary and prediction becomes as urgent as standard adaptive management. At the interface between future studies, socio-economic modelling and environmental modelling, we define 18 scenarios of economic development and climate change impacts and five management strategies. We explore these potential futures using coupled models of terrestrial and marine ecosystem dynamics. We obtain a projection of the Kimberley marine system to the year 2050, conditional on the chosen scenarios and management strategies. Our results suggest that climate change, not economic development, is the largest factor affecting the future of marine ecosystems in the Kimberley region, with site-attached species such as reef fish at greatest risk. These same species also benefit most from more stringent management strategies, especially expansion of sanctuary zones and Marine Protected Areas.

Oceanic Shoals Marine Park and Joseph Bonaparte Gulf Marine Park (part of the Northern Territories Network of Marine Parks). More details of zoning and managing activities can be found in 1 .

EwE functional groups and their variation of input biomass data
Supplementary Table S 1. Ecopath with Ecosim (EwE) functional groups (first column) and variation of their input biomass based on changes of habitat, distribution, and productivity predicted by climate models, acidification processes, sea level rise and ALCES model. Whenever possible references are included in the table. In Ecosim (a temporal dynamic ecosystem model), we used different environmental and habitat parameters to simulate the predicted ALCES changes in estuaries, wetlands, and sediment yield transported by rivers to the ocean. We assumed that these changes in habitat quality and cover area could result in proportional effects on natural mortality, vulnerability to predation and relative feeding rates of species associated directly with the habitats affected. In the case of the climate change impacts, we used three levels of RCP emissions (see main document) to force the input biomass of target species based on the predicted trajectories of Australian fish stocks 2  . Only indicators whose final state change more than 40% compared to the mean of all scenarios are included in this plot.

Results of bootstrap analysis supporting importance of climate change over development
The cluster analysis of the scenario results suggests that climate change affects the model output more than the development scenarios. To assess the statistical significance of this results, we would need a much larger set of simulations than we can afford to run. As an alternative approach, we used bootstrapping ( 11-13 ) to generate a large number of surrogate EwE results and compared the output of the 18 scenarios so far discussed against this set of surrogate results. In particular: 1) We concatenated the biomasses of all living species (57) at the end of the simulation for each scenario, into a 18*57 matrix (one scenario per row, one species per column) 2) We generated a large (10^4) set of surrogate biomasses distributions. Each surrogate biomass distribution was generated by choosing, per each of the 57 columns (species), a random row and assigning the biomass value at that row to that species. This results in a 'realistic' set of biomasses values (since the value are the output of actual EwE runs), in which the relation between species biomass, as imposed by EwE dynamics, is lost. 3) For each of the (10^4) set of surrogate biomasses distributions, we compute the Total Divergence against a baseline EwE run. This results in a random distribution of 10^4 Total Divergence values. 4) Next, we compute the Total Divergence against a baseline EwE run for each of the 18 scenarios under analysis. 5) Finally, we add these 18 Total Divergence to the random distribution values and we sort the full set of 10018 Total Divergence in increasing order. The position of the Total Divergence of the 18 scenarios in this sorted list is a proxy for the statistical significance 11 , that is, the closer to the extremes of the distribution these scenarios are, the more they statistically differ from the full set.
Four of the six high climate scenarios and two of the low climate scenarios are significantly different from the overall distribution, with a p<0.05, while five of the six high climate scenarios and four of the low climate scenarios are significantly different from the overall distribution, with a p<0.1, where the high and low climate scenarios are at opposite extremes of the distribution.

Biomass changes for all indicators for each management strategy
Supplementary Figure

Total Divergence
To cluster the 18 modelled scenarios for climate change and human development into natural groupings, we used the Total Divergence as a measure of scenario similarity. We need a measure which captures both changes in how the biomass is distributed between species as well as changes in total biomass.
Let's consider an ecological system made up of n functional groups. Each group has biomass Bi, i=1..n. At each time step t, we represent the system by its biomass distribution . 1 Next, we normalise the biomass distribution = ∑ . bi can be understood as a probability distribution, that is, as the probability that a random unit of biomass in the ecosystem belongs to group i. Let's assume bi is the output of an ecological model. At the beginning of the simulation (t=t0) we have 0 . After running the model for ∆ , the output biomass is 1 . We want to define a measure which reflects how much the system has changed during ∆ = 1 − 0 . A number of measures are available to compute difference between statistical distributions. An often used measure in information theory is the Kullback-Leibler divergence 15 , DKL is a weighed entropy and can be interpreted in several ways. One interpretation pertinent to the management of natural resources is that it represents the loss of information in studying the system via 0 when the system has changed to state 1 . Equivalently, it can be interpreted as the information gain in realising that the system has changed to 1 .
Nevertheless, the Kullback-Leibler divergence has a number of drawbacks which hamper its use in our application. First, it is not symmetric, ( 0 , 1 ) ≠ ( 1 , 0 ) and as a result it does not define a proper distance metric. It also means that the divergence between 2 states depends on which one is seen first (although this can be make symmetric by replacing it with ( 0 , 1 )+ ( 1 , 0 ) 2 16 ). A more serious drawback is that ( 0 , 1 ) = ∞ if any 0 = 0, which imply that there is no upper bound for ( 0 , 1 ). It also means that the symmetric version is not computable if either any 0 = 0 or any 1 = 0. To address these drawbacks, a number of alternative measures are reviewed in [16][17][18] . A particularly convenient option for our purpose is the Hellinger distance: The Hellinger distance circumvents both drawbacks for the Kullback-Leibler divergence. Furthermore its maximum value is equal to one when 0 has probability zero for all i for which 1 has non-zero probability and vice versa. As we will see, this property is particular convent in our approach. As a result, in the rest of the discussion, we will use (2) in place of (1) 2 .
Eq 2 measures how the biomass distribution at time t1 changes relative to what it was at time t0. As such, it can be understood as a measure of change in the ecosystem biodiversity. However, it says nothing of whether the total biomass ∑ has changed. For example, if the biomass for each group decreases by 50%, ( 0 , 1 ) = 0. To account for changes in total biomass we add a virtual group to our ecosystem. We call this group 'LostBiomass' and we define it as +1 = ∑ 1 -∑ 1 , where represents the system B in an ideal unperturbed, pristine state. At each time t, LostBiomass includes the biomass not belonging to any group i=1..n. This can be interpreted as the biomass lost to the system. Because bi, i=1..n+1 must be a probability distribution, we must ensure that +1 ≥ 0. We achieve this by assuming that the pristine system has higher total biomass that the perturbed states B t . In practice, we do not need the full distribution , but only the total biomass = ∑ 1 , which is constant, and then define +1 = − ∑ 1 .

Sensitivity analysis of the Ewe marine model
The Kimberley EwE model resulting from the mass-balancing process represents only one possible realization of this ecosystem. Model sensitivity to Ecopath input parameters (B, Q/B, P/B and diet) was carried out via Monte Carlo simulations in Ecosim (the time-dynamic model). The pedigree routine in Ecosim (see details in 20 ) assesses the reliability of these input parameters based on data type and origin and determines the confidence of variation (CV) used in the Monte Carlo approach. This included 500 simulation runs of the high climate impact, high development, dry scenario, with parameters chosen from normal distributions centered on the initial input parameter estimates and CV determined by the model pedigree. This approach allowed us to identify the groups whose biomass changes the most as a function of input parameters and CV. In deceasing order, these are seagrass, phytoplankton, salps and jellyfish, zooplankton and macrophytes (Supplementary Figure S 5, left panel, also highlighted in yellow in the EwE foodweb). The Monte Carlo runs also provide indications about the overall trophic control in the system and suggest that the Kimberley marine food web is associated with bottom-up control because of larger sensitivity to changes in the biomass of the lower section of the food web (functional groups within trophic levels of 1 to 2.5).
In addition, the quality of the input data was also assessed in the mass-balancing of the EwE model. We computed the Ecotrophic Efficiency (EE), a measure of the proportion of production that is utilized in the system by predation or fishing (EE cannot exceed 1.0, that is, it is not possible to consume more than is produced). As a general rule, EE values near 1.0 are expected for groups whose production is consumed by predators or removed by the fishery while values near to 0.0 are expected for groups with low predation rates (e.g. top predators like sharks) or not targeted by a fishery. Checks on EE values, and thermodynamic and ecological rules of thumb 20 (i.e. slopes of biomass ratios, total production, mortality rates, and values of Ecotrophy Efficiency) are used to highlight groups in the pre-balanced model whose input parameters need to be adjusted within biologically plausible limits to achieve mass-balance of the flows in the food-web. We reduced predation mortality rates of groups with unrealistic EEs (juveniles of Lethirnids, pelagic fish, mud crabs, squids, deep demersal fishes, turtle hatchlings, Banana prawns, Large reef fishes among others) by identifying their main predators and adjusting diet matrices until EEs were <1.0. This diagnostic approach highlighted the groups with more reliable data and therefore whose model predictions have lower uncertainty. The two approaches adopted in this study, provided both an assessment of the effect of uncertainty in Ecopath input data as well as an identification of knowledge gaps of the foodweb elements.
Supplementary Figure  Box 1 represents the first step involving the problem formulation. This was carried out in collaboration with an initial set of stakeholders and led to agreement on the pressures and drivers of change expected to have the highest impact on the future of the Kimberley region. In turn, this led to an assessment of the level of uncertainty on the state of the system and on the processes acting on it and a discussion on the kind of levers and control available to the management authorities. This step also involved the identification of a larger set of stakeholders including institutes and researchers able to provide data for model parametrisation. The stakeholder team included both stakeholders familiar with modelling and the modelling team as well as stakeholders who had never interacted before this project.
Reaching a shared understanding on the most significant pressures, drivers of change and uncertainties was also necessary in order to select the appropriate temporal and spatial scale for the analysis as well as to realise that adaptive management could not be used as the only approach to this task, as discussed in the Introduction in the main document. The acknowledgment that the problem would benefit from being cast within a Future Studies approach led to the definition of the set of scenarios as discussed in Section 4.2 in the main document (Box 2). With a definition of the purpose, temporal and spatial scale of the analysis and the scenario to model, the team progressed to the model implementation, parameterisation and calibration (Box 3) (see Section 4.1 in the main document).
The initial set of model simulations allowed the team to identify climate as the main driver of change in the system and to select a subset of scenarios for further analysis (Box 4). At this stage, interaction with stakeholders helped identify a set of strategies available to manage the network of marine parks (Section 4.3 in the main document) (Box 6) for further model simulation. In the figure, blue boxes represent activities involving stakeholder input and interaction, while the red boxes represent activities mostly involving computer modelling. As can be seen, closed interaction with different stakeholder teams was carried out at all stages of the project.
Supplementary Figure S 6 also helps us obtain a rough picture of the relation between adaptive management and Future studies approaches as carried out in the project. While the project was carried out as a whole, and each activities benefitted by the team's appreciation of both approaches, it is easy to recognise the adaptive management/Management Strategy Evaluation (MSE) 21,22 cycle on the right (Boxes 4, 5 and 6) and the problem formulation following a Future studies approach on the left (Box 1 and 2).
Supplementary Figure S 6. Flowchart of the process we have used to carry out this research. Blue boxes represent activities involving stakeholder input and interaction. Red boxes represent activities mostly involving computer modelling.