Five key attributes can increase marine protected areas performance for small-scale fisheries management

Marine protected areas (MPAs) have largely proven to be effective tools for conserving marine ecosystems, while the socio-economic benefits that MPAs generate for fisheries are still under debate. Many MPAs embed a no-take zone, aiming to preserve natural populations and ecosystems, within a buffer zone where potentially sustainable activities are allowed. Small-scale fisheries (SSF) within buffer zones can be highly beneficial by promoting local socio-economies. However, guidelines to successfully manage SSF within MPAs, ensuring both conservation and fisheries goals and reaching a win-win scenario, are largely unavailable. From the peer-reviewed literature, grey literature and interviews, we assembled a unique database of ecological, social and economic attributes of SSF in 25 Mediterranean MPAs. Using random forests with the Boruta algorithm, we identified a set of attributes determining successful SSF management within MPAs. We show that fish stocks are healthier, fishermen incomes are higher and the social acceptance of management practices is fostered if five attributes are present (i.e. high MPA enforcement, presence of a management plan, fishermen engagement in MPA management, fishermen representative in the MPA board, and promotion of sustainable fishing). These findings are pivotal to Mediterranean coastal communities so they can achieve conservation goals while allowing for profitable exploitation of fisheries resources.


Data gathering procedure
To compile information for each MPA, we used multiple sources. 1) First, we e-mailed MPA managers and scientists questionnaires designed to obtain specific data on SSF management in each MPA (e.g. the number of boats allowed to fish in the MPA, gears allowed, regulations, involvement of fishermen in the management process). The questionnaire comprised forty-eight questions on SSF management in each MPA, together with the related evidence of ecological and socio-economic effectiveness. In this way, managers and scientists provided us with information that we would otherwise not have been able to obtain, allowing us to compile an unprecedented dataset for the Mediterranean. Direct contacts via e-mail and/or phone calls with MPA managers and scientists were also carried out to refine the questionnaire responses. Owing to the very patchy nature of small-scale fishing communities in the Mediterranean Sea and the general lack of associations (and therefore representatives) grouping fishermen at a small spatial scale (e.g. fishermen operating within and/or close to an MPA), we were not able, on such a large scale, to interview fishermen about the issues investigated. It could therefore be argued that we collected information from a narrow range of informant types (i.e. MPA managers and scientists); however, the questionnaire did not ask for respondents' perceptions but only for official and factual information (e.g. number of fines, management procedures adopted to engage fishers in management). In addition, for the level of fishermen engagement in SSF management, a variable that could be prone to perception bias, we performed fact-checking in a subset of MPAs where we have long-standing collaborations with both managers and small-scale fishermen (i.e. Torre Guaceto, Scandola, Tavolara, Portofino).
In all 4 case studies, the information provided by managers matched both the fishermen's perspective and our own perception in the field. We can therefore reasonably consider our database to be free from bias related to partial perceptions; issues related to differing perspectives among stakeholders are beyond the scope of the present work.
2) We also reviewed the scientific literature, obtained through a comprehensive search of electronic library databases (Web of Science, Google Scholar, Scopus) using keywords (different combinations of: fish*, fisher*, CPUE, catch*, small scale, coastal, artisanal, manag*, outcome*, benefit*, socio*, econom*, and income*) together with the name of each MPA, following up the references therein, with an additional search performed on Google. We expanded our review to 3) studies published at a national/local level and in conference proceedings; as such studies are often unavailable online and unreported in scientific databases, we searched for these sources in our personal archives or explicitly asked colleagues working in other scientific institutions, NGOs, MPAs, etc.; and to 4) grey literature and unpublished studies (e.g. project reports) carried out by the MPAs' management bodies and provided directly by the MPAs. This procedure also allowed us to account for other documents unavailable from more conventional sources.
Manuscripts in different languages (i.e. English, Spanish, French, Italian and Croatian) were reviewed and processed. The literature search was performed in October 2014; papers and other documents published after this date were not considered in our analysis.
Twenty-five MPAs replied to the questionnaire and were therefore retained in our database. All the other MPAs were discarded due to a lack of crucial information.

Outcomes assessment procedure
For both ecological effectiveness and increase in fishermen incomes, a positive effect of an MPA was assigned a score of 1 (presence). The absence or presence of positive effects of MPAs was assessed either as a result of the implementation of management (before-after analysis) or, when comparisons were available, between MPAs and surrounding unprotected areas (control-impact analysis). When information about fishermen incomes was missing, we used CPUE or CPUA (catch per unit of effort or per unit of area, depending on the information available for each MPA) as a proxy of fishery benefits (higher catches or incomes), assuming a constant 'fish price' between each MPA and its unprotected "control" area (or over time) and assuming that fishermen incomes are related to the amount of fish caught. We also assumed the cost of fishing to be constant between each MPA and its unprotected "control" area (or over time). These assumptions make it possible to provide only a simplified estimate of 'fishing income', since income depends not only on the amount of fish caught but also on a number of other variables, such as fishing operational costs (e.g. fuel, personnel), the specific techniques used and the location of fishing areas.
Revenues are also related to species composition (some species being much more valuable than others) and fish size (as bigger fish are more valuable than small ones). However, as a general rule, the most valuable species are the ones most targeted by fishing and also the ones that benefit most (in terms of increased density and size) from MPA protection 36,38,55. From this perspective, a catch of equal weight from an ecologically effective MPA (composed of more commercially valuable fish of larger size) could be more valuable than a catch from an unprotected area. Accordingly, the positive effect of an MPA on fishermen incomes could be underestimated in this study, but the general lack of detailed data makes more in-depth analyses of fishermen incomes in MPAs impossible.
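As an illustration of why constant price and constant cost let CPUE stand in for income, the minimal Python sketch below computes the income gap between an MPA and its control under those two assumptions. The function name, the simple "catch times price minus cost" income model and all numbers are hypothetical and are not taken from the study.

```python
def income_difference(cpue_mpa, cpue_control, price_per_kg, cost_per_unit_effort):
    """Income gap (MPA minus control) per unit of effort.

    ASSUMPTION: price and cost are identical in the MPA and in the control,
    as stated in the text; income is modelled simply as catch * price - cost.
    """
    income_mpa = cpue_mpa * price_per_kg - cost_per_unit_effort
    income_control = cpue_control * price_per_kg - cost_per_unit_effort
    # The cost term cancels, leaving price * (cpue_mpa - cpue_control).
    return income_mpa - income_control


# Hypothetical values: 12 vs 8 kg per unit of effort, 10 currency units per kg.
gap = income_difference(12.0, 8.0, price_per_kg=10.0, cost_per_unit_effort=30.0)
```

Because the cost term cancels and the price is a positive constant, the sign of the income gap equals the sign of the CPUE gap, which is all that a presence/absence score requires.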
Fishermen environmental commitment was defined by considering 1) compliance with MPA rules, i.e. whether fishermen authorized to operate within the MPA boundaries committed infractions for illegal fishing, and 2) fishermen participation in research/environmental programs developed in the local MPA, i.e. whether multiple fishermen systematically participated in these activities. If fishermen did not commit any infractions and were engaged in research/environmental programs, a score of 1 was assigned to the MPA.
Fishermen environmental commitment was identified as an outcome in order to capture the social dimension of the complex socio-ecological system represented by SSF management in MPAs and to include it in the building of the overall success score. This outcome is a proxy of fishermen commitment toward MPA management goals and, conversely, of social conflicts/friction between fishermen and the management body (which represent a concrete hurdle to fishery management in MPAs 19); from this perspective, successful management should aim to reduce or eliminate these conflicts and foster fishermen commitment, just as it should aim for economic benefits for fishermen and ecological benefits in terms of increased fish density/biomass.
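The scoring rules above can be sketched in a few lines of Python. The commitment rule follows the text (a score of 1 only when both conditions hold); combining the three binary outcomes by a plain sum is an assumption made here for illustration, since this section does not spell out how the overall success score is built. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MpaOutcomes:
    ecological_effectiveness: int   # 1 = positive effect documented, else 0
    fishermen_incomes: int          # 1 = positive effect documented, else 0
    environmental_commitment: int   # 1 = compliant and engaged, else 0


def commitment_score(no_infractions: bool, joins_programs: bool) -> int:
    """Score 1 only if fishermen committed no infractions AND took part
    in research/environmental programs, as described in the text."""
    return 1 if (no_infractions and joins_programs) else 0


def overall_success_score(outcomes: MpaOutcomes) -> int:
    """ASSUMPTION: the overall score is the plain sum of the three binary
    outcomes (range 0-3); the source does not state the formula here."""
    return (outcomes.ecological_effectiveness
            + outcomes.fishermen_incomes
            + outcomes.environmental_commitment)


# Hypothetical MPA: ecologically effective, no income evidence, committed fishers.
example = MpaOutcomes(1, 0, commitment_score(True, True))
example_score = overall_success_score(example)
```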

Random forests optimization
The random forest (RF) algorithm implemented in the R package randomForest 58 has 3 hyperparameters known to affect RF predictive accuracy and attribute importance estimates: 1) ntree, the number of trees grown; 2) mtry, the number of attributes randomly sampled as candidates at each split of a tree; and 3) nodesize, the minimum size of terminal nodes. It is therefore crucial to tune these 3 hyperparameters in order to optimize the RF model. To do so we adopted the following steps: 1) assess the minimal ntree (ntree_m) at which the out-of-bag error rate stabilizes at a low value using the default mtry and nodesize (i.e. 4 and 1 respectively), and select an ntree (ntree_o) higher than ntree_m, because a larger number of trees does not cause RF to overfit 33; 2) find the best combination of mtry and nodesize (i.e. the combination that minimizes the mean error rate) through a grid search over 9 x 10 combinations (mtry from 1 to 9, nodesize from 1 to 10), fitting 100 RFs for each combination and using the mean out-of-bag error rate over the 100 runs; 3) check whether the ntree_o selected in step 1 (i.e. ntree=1000) is still large enough to stabilize the out-of-bag error rate when using the best combination of mtry and nodesize found in step 2. These steps were run separately for each of the 4 outcomes (i.e. OMS, ecological effectiveness, fishermen incomes, fishermen compliance and commitment; see Supplementary Fig. S2-S5 for details about the 3 steps for each outcome).
As an additional check of the sensitivity of Boruta outputs to hyperparameter selection (i.e. fine tuning of mtry and nodesize), we qualitatively compared the outputs of Boruta models fitted with the 3 best combinations of hyperparameters identified by the above-mentioned grid search.
For none of the 4 outcomes did we detect relevant variability among the outputs of the random forest models fitted with the 3 best combinations of hyperparameters (Supplementary Fig. 6), which highlights the robustness of RF and Boruta to fine tuning of the hyperparameters.
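The grid search in step 2, together with the selection of the 3 best combinations, can be sketched as follows. This is a minimal Python outline, not the authors' R script: `oob_error` is a hypothetical callable standing in for one random forest fit that returns its out-of-bag error rate, and `toy_oob_error` is an invented error surface used only to make the example runnable.

```python
import itertools
import random
import statistics


def grid_search(oob_error, mtry_values, nodesize_values, n_runs=100, top_k=3):
    """Return the top_k (mean_error, mtry, nodesize) tuples, best first.

    Each combination is fitted n_runs times and scored by the mean
    out-of-bag error, mirroring the averaging described in the text.
    """
    results = []
    for mtry, nodesize in itertools.product(mtry_values, nodesize_values):
        errors = [oob_error(mtry, nodesize, seed) for seed in range(n_runs)]
        results.append((statistics.mean(errors), mtry, nodesize))
    results.sort()  # ascending mean error
    return results[:top_k]


def toy_oob_error(mtry, nodesize, seed):
    """HYPOTHETICAL stand-in for a random forest fit: an arbitrary error
    surface with its minimum at mtry=3, nodesize=5, plus small noise."""
    rng = random.Random(seed * 10_000 + mtry * 100 + nodesize)
    noise = rng.uniform(-0.002, 0.002)
    return 0.20 + 0.01 * abs(mtry - 3) + 0.005 * abs(nodesize - 5) + noise


# 9 x 10 grid as in the text: mtry 1..9, nodesize 1..10, 100 runs each.
top3 = grid_search(toy_oob_error, range(1, 10), range(1, 11))
```

In a real analysis the callable would wrap an actual random forest fit; the three tuples returned are the candidates that would then be compared qualitatively through Boruta.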

Random forests and Boruta sensitivity to the number of MPAs considered
In order to assess whether our sample size (n, the number of marine protected areas included in the present study) provided a level of replication adequate for reliably estimating the relevance of the attributes in determining overall success, we assessed the sensitivity of the RF model to n. Specifically, we implemented an R script that repeated 1000 times (independently) a random sampling without replacement of n_i MPAs (a subset of the 25 MPAs considered, with 15 ≤ n_i ≤ 25) and ran the Boruta algorithm using ntree=1000, mtry=2 and nodesize=7 (i.e. the best combination of hyperparameters found for the whole dataset). For each Boruta run, the decision for each attribute (confirmed, tentative or rejected) was stored.
Finally, for each attribute at each n_i, the proportion of each Boruta decision over the 1000 repetitions was computed (Supplementary Fig. 7).
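The resampling loop described above can be outlined in Python as follows. This is a sketch under stated assumptions rather than the original R script: `decide` is a hypothetical callable standing in for a Boruta run that maps a subset of MPAs to one decision per attribute, and `toy_mpas` and `toy_decide` are invented solely to make the example executable.

```python
import random
from collections import Counter

DECISIONS = ("confirmed", "tentative", "rejected")


def subsample_sensitivity(mpas, decide, sizes, n_repeats=1000, seed=0):
    """For each subset size n_i, draw n_repeats subsets without replacement
    and return {n_i: {attribute: {decision: proportion}}}."""
    rng = random.Random(seed)
    proportions = {}
    for n_i in sizes:
        counts = {}  # attribute -> Counter of decisions
        for _ in range(n_repeats):
            subset = rng.sample(mpas, n_i)  # sampling without replacement
            for attribute, decision in decide(subset).items():
                counts.setdefault(attribute, Counter())[decision] += 1
        proportions[n_i] = {
            attribute: {d: c[d] / n_repeats for d in DECISIONS}
            for attribute, c in counts.items()
        }
    return proportions


# HYPOTHETICAL data: 25 MPAs with one binary attribute and a 0-3 score.
toy_mpas = [{"enforcement": i % 2, "score": (i % 2) * 2 + (i % 3 == 0)}
            for i in range(25)]


def toy_decide(subset):
    """HYPOTHETICAL stand-in for Boruta: decide from the gap in mean score
    between MPAs with and without the attribute."""
    with_attr = [m["score"] for m in subset if m["enforcement"] == 1]
    without = [m["score"] for m in subset if m["enforcement"] == 0]
    if not with_attr or not without:
        return {"enforcement": "rejected"}
    gap = sum(with_attr) / len(with_attr) - sum(without) / len(without)
    if gap >= 1.0:
        return {"enforcement": "confirmed"}
    return {"enforcement": "tentative" if gap >= 0.5 else "rejected"}
```

Calling `subsample_sensitivity(toy_mpas, toy_decide, range(15, 26))` yields, for every n_i from 15 to 25, the share of runs in which each decision was reached, which is the quantity the text describes plotting.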

Random forests and Boruta sensitivity to possible miscoding in MPA success score estimation
We assessed the sensitivity of Boruta outputs to potential miscoding of the three outcomes (ecological effectiveness, fishermen incomes and fishermen environmental commitment) used to build the overall success score. Miscoding could be due to incorrect information reported in the documents (i.e. papers/reports) we used to code each variable for each MPA, or to the involuntary exclusion of useful reports/papers that were particularly difficult to find through our bibliographic search and unknown to MPA managers. Although highly unlikely (given our extensive literature search and the extensive use of peer-reviewed papers), this potential bias cannot be excluded. We therefore implemented an R script that repeated 1000 times (independently) random forests and Boruta with the response variable (i.e. the overall success score) modified by randomly adding or subtracting 1 for an increasing number of MPAs (n_i) in our dataset. We used ntree=1000, mtry=2 and nodesize=7 (i.e. the best combination of hyperparameters found for the whole dataset). For each Boruta run, the decision for each attribute (confirmed, tentative or rejected) was stored.
Finally, for each attribute at each n_i, the proportion of each Boruta decision over the 1000 repetitions was computed (Supplementary Fig. 8).
Boruta outputs proved to be highly robust to moderate variation in the overall success score (i.e. adding or subtracting 1), especially for three attributes: enforcement, level of fishermen engagement and presence of a management plan. For these attributes, an outcome would need to have been miscoded in at least 6-10 MPAs (depending on the attribute, see Supplementary Table 4) for the Boruta outputs to deviate from the ones obtained in the present study.
We also assessed the effect of potentially more drastic miscoding by implementing a script similar to the one described above, but modifying the overall success score by randomly adding or subtracting 1 or 2 for an increasing number of MPAs (Supplementary Fig. 8). In this case too, the output was robust for 5 of the 6 significant attributes, and especially for enforcement, level of fishermen engagement and presence of a management plan (Supplementary Table 4). These analyses highlight that our results are highly robust to moderate miscoding and remain robust even to more drastic miscoding.
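The perturbation of the response variable used in both miscoding analyses can be sketched in Python as below. This is an illustrative outline, not the original R script. Two points are assumptions made here and not stated in the text: that the overall success score ranges from 0 to 3, and that a perturbed score is kept within that range by drawing only among shifts that stay inside it. Function and parameter names are hypothetical.

```python
import random


def perturb_scores(scores, n_miscoded, max_shift=1, low=0, high=3, rng=None):
    """Return a copy of scores in which n_miscoded randomly chosen MPAs
    have their score shifted by a random non-zero amount in
    [-max_shift, +max_shift].

    ASSUMPTION: scores are integers in [low, high] and shifts that would
    leave this range are excluded from the draw.
    """
    rng = rng or random.Random()
    perturbed = list(scores)
    for idx in rng.sample(range(len(scores)), n_miscoded):
        shifts = [s for s in range(-max_shift, max_shift + 1)
                  if s != 0 and low <= perturbed[idx] + s <= high]
        perturbed[idx] += rng.choice(shifts)
    return perturbed


# Hypothetical scores for 8 MPAs; 4 of them are "miscoded" by +/-1.
original = [0, 1, 2, 3, 3, 0, 2, 1]
moderate = perturb_scores(original, 4, rng=random.Random(1))
# More drastic miscoding: shifts of 1 or 2 applied to all 8 MPAs.
drastic = perturb_scores(original, 8, max_shift=2, rng=random.Random(2))
```

Each perturbed vector would then replace the response variable in a random forest and Boruta run, and the decisions would be tallied over repetitions for every number of miscoded MPAs.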