## Introduction

Land-use change is a key threat to the conservation of biodiversity, ecosystems1, and the services they provide globally2,3, and migratory species are particularly vulnerable to such change given the vast geographic areas they occupy over the annual cycle4,5. Indeed, a recent global assessment indicated that protected areas adequately protect the ranges of just 9% of migratory bird species5. Strategic approaches to identify and conserve habitats critical to the persistence of migratory species are therefore sorely needed.

Unfortunately, substantial gaps in knowledge of the abundance, distribution, and demography of most migratory species6 have hampered strategic planning and led to uncertainty about the optimal allocation of conservation effort5,7. Given that populations of many migratory species continue to decline4,8, there is an urgent need to identify portfolios of lands critical to the persistence of target species, and amenable to management in support of species conservation without compromising human well-being.

Multi-species decision support tools can facilitate the identification of areas crucial to the conservation of migratory species, but have remained intractable due to limits on knowledge and computing power. We capitalized on advances in models of bird species abundance and distribution using crowd-sourced data9,10 and linear programing techniques11 to develop a robust multi-species planning tool to estimate the land area needed to conserve 117 Nearctic-Neotropical migratory songbirds throughout the annual cycle (Supplementary Data 1). Specifically, we combined fine-scale, predictive models of distribution and abundance estimated weekly throughout the year with spatial optimization techniques12 to identify the amount and type of land needed to reach our conservation targets given alternative planning scenarios at hemispheric scales.

We first estimated the abundance and distribution of 117 migratory bird species weekly, using spatiotemporal exploratory models9,13 to calculate the relative abundance of each species throughout the annual cycle (Supplementary Movie 1). Incorporating information for each week of the year is especially important for migratory species, as this reflects their movements throughout the annual cycle and allows more precise estimates of their population distributions in space and time. We next recorded and compared the geographic area requirements and land cover types selected when optimizing for each week of the year independently and summing the total area over all weeks (hereafter, ‘weekly’), versus optimizing over the entire year at once (hereafter, ‘yearly’). Weekly optimizations for area efficiencies were developed to identify species-specific priorities for species at fine enough scales to capture short-term stopover sites. Our yearly approach optimized efficiency over the full annual cycle of each species, and emphasized temporal consistency in abundance hotspots more likely to reflect breeding and non-breeding regions. Because all existing conservation plans consider stationary phases of the breeding and non-breeding periods separately14,15, our analysis is the first example of spatial optimization scenarios which track populations over their full annual cycle.

We next created area-optimized solutions designed to conserve lands used by 30% of the global populations of all 117 species in each of 52 weeks by sampling species (a) over their entire range, without accounting for spatial clustering of species abundance, or (b) by sampling within 5 regional population clusters, identified weekly to accommodate spatial clustering in species abundance. We wanted to account for the spatial clustering of species abundance because broadly distributed species often exhibit strong regional-scale variation in abundance across their range. Regional-scale variation in species’ abundance may reflect a number of important processes affecting the ecology and conservation of species, from variation in resource availability and land-use patterns to population-structure related to movement and migratory connectivity. By accounting for the spatial clustering of species abundance, the prioritization is stratified over multiple regions to ensure adequate protection over the entire species range. Because spatial clustering in species abundance—let alone its consequences for movement or connectivity—is poorly understood in most migratory species16, we developed an innovative approach to account for structure statistically. Specifically, we used cluster analysis to delineate abundances into 5 spatial clusters of equal abundance and stratified our weekly sampling among clusters to capture the full geographic distribution of each species. Our use of five clusters is an example that minimized computational effort, but which could be extended to optimize by species or goal. Our 30% target is also arbitrary, but intermediate to the 17% of terrestrial ecosystems targeted by the Convention on Biodiversity17 and 50% targets suggested by comparative analysis18, and can be modified to reflect strategic goals19.

Last, we compared area-based conservation plans designed to represent different perspectives about the potential contribution of human-modified lands to the conservation of migratory birds. Our ‘intact habitat’ approach emphasized the protection of relatively intact habitat as indicated by a low human footprint index20 (Supplementary Fig. 1), whereas our ‘shared-use’ approach permitted the inclusion of landscapes converted to more intensive use by humans21,22. Our scenarios, termed intact habitat and permissive of shared-use, are analogous, but more general, than land sparing and land sharing scenarios. Exploring such constraints represents a critical step in conservation planning, given that human cultural history, values, and well-being can all affect conservation success and represent critical inputs into structured decisions about the most efficacious actions23,24,25.

## Results

### Weekly vs. yearly approaches

The land area required to achieve yearly conservation targets was 56% less on average than when area targets were summed across weeks (range = 49 to 65%; Table 1). Yearly solutions required relatively less land area than weekly approaches in shared-use scenarios (62% less) than in intact habitat scenarios (49% less, Table 1, Figs. 1, 2). Area reductions under yearly planning generally resulted from cases such as the inclusion of sites used by more than one species across two or more weeks of the annual cycle. A likely explanation for this difference is that the yearly approach will select sites that are used for longer periods of the annual cycle, and that there may be greater overlap of those areas between species than occurs in the short-term stopover sites included in the weekly approach, hence the larger area needed under the latter.

### Single populations vs. spatial clustering

As expected, the area required to reach our conservation targets increased when we accommodated spatial clustering of species abundance, although relatively less so under a intact habitat (13% increase) compared to a shared-use (26% increase) scenario (Table 1, Figs. 1, 2). This reduction occurred in part because the homogenous cost structure used in our intact habitat scenario was less influential on site selection than was the heterogenous cost structure used in the shared-use scenario. Although we currently lack empirical data with which these spatial clusters can be validated, our predictions can be tested directly as tracking and genetic mapping techniques improve to allow comparisons of observed and predicted migration routes. That being noted, our current method provides a useful approach to ensure geographic representation of spatial clustering of species abundance of a broad suite of species using publicly-available citizen science data in spatial planning tools.

Many conservation interventions, including land protection, are constrained by limits on fiscal or human resources and the opportunity costs of development. Our results show that sampling populations across the species range each week required almost twice the amount of land compared to yearly plans based on the relative abundance of species. Our work thus demonstrates the daunting problem of conserving sufficient land area such that migratory species’ dynamic populations are protected throughout the year5,7,10.

### Intact habitat vs. shared-use

Another key result of our work is that incorporating conservation objectives in human-dominated habitats may dramatically improve the efficiency of conservation area designs for migrants if their demographic performance is similar in ‘working’ and ‘intact’ landscapes. Under the yearly scenario we found that shared-use approaches required 26 and 18% less land area, respectively, than intact habitat approaches including or ignoring spatial clustering of species abundance (Table 1). We also found that intact habitat approaches selected different geographic areas and ecosystems than shared-use approaches. Most notably, intact habitat approaches selected larger areas of needle-leaved forest in boreal and mountainous zones of western Canada, and more broad-leaved evergreen forest in the eastern Andes and western Amazon basin (Figs. 1, 2; Table 2). Our findings thus add to a growing body of literature indicating the need to broaden the lens through which we view conservation to both accommodate human livelihoods and conserve species23,24,25.

### Overarching considerations

Our findings suggest a need to re-evaluate conservation planning processes that are based on less precise methods. For example, government and non-governmental organizations allocate up to \$1 billion annually to bird conservation based on aspatial targets and expert elicitation, with most actions directed to breeding habitat14,15. Our results suggest an alternative approach with a potential to meet conservation targets at lower land management cost and more compatible with human-dominated lands that can serve the dual purpose of supporting migratory species and human livelihoods. Overall, our results also illustrate potential trade-offs that conservation practitioners considering optimized portfolios must consider as additional targets and constraints are identified and incorporated in higher-level management models25. Even without consensus among conservation practitioners on focal scenarios, a considerable area of land was selected in at least six of the eight scenarios explored here, illustrating that many priority areas meet a wide range of perspectives encapsulated in our scenarios (126, 000 km^2, Fig. 3). This indicates that even where approaches differ, conservation practitioners can use tools like the one described here to build consensus in hemispheric conservation efforts.

Several additional caveats arise from our results, particularly with respect to the shared-use and intact habitat scenarios. Implementing conservation action in working landscapes may be more challenging than in areas with less human activity if the opportunity costs of management are higher. For example, even if identified as a high-priority site for conservation in our shared-use scenarios, land already converted to human use may be more vulnerable to degradation in the future than more intact areas26. Such habitat degradation, especially if combined with other anthropogenic stressors that may directly or indirectly reduce survival or performance of wildlife27, could make it difficult to reach population goals for species even if area needs are lower compared to less developed landscapes. In practice, both approaches are likely to be used given that target species will differ in their reliance on more or less developed habitat types28. Our prioritization scenarios provide planners with guidance on the approximate locations and requirements for land needed to meet our stated targets under a range of scenarios. With such portfolios in hand, planners can then more readily assess the cost-effectiveness of alternate approaches to land management and socio-economic policies most favorable to conservation and human well-being23,24,25.

Additional species or landscape characteristics might also be considered in prioritization, depending on the mandate of conservation agencies. For example, the threat status of species has been incorporated into spatial planning recommendations29,30. Our analysis included 19 red or yellow-listed species by Partners-in-Flight (Supplementary Data 1). In addition, the specific life-history characteristics and habitat requirements of species, where available, can be important considerations. The 117 species in our analysis varied in spatiotemporal patterns of migration, habitat use, abundance, and population abundance. Although we did not address how variation in life history among species might influence results, it is clear that as high-resolution distribution models become available for a larger range of migratory species, such as waterfowl, shorebirds, and marine, forest or grassland specialists, planners will have more opportunities to filter species or emphasize particular guilds to represent particular ecosystems or habitats. Different representation targets are also possible. We emphasize that the 30% target used here is illustrative only. In some cases, higher targets may be needed to avoid range contraction or the local extinction of sub-populations, to conserve ecological function such as seed dispersal or pest control31, or to maintain the evolutionary potential of locally-adapted populations32,33. Nevertheless, our 30% target returned solutions in all cases which vastly exceed the areal extent of existing conservation plans in support of Neotropical migrant birds.

## Discussion

Ongoing declines in the abundance and distribution of many migratory species amid severe constraints on financial and human resources34 point to an urgent need for area-based plans that achieve conservation targets while minimizing the opportunity costs of land conservation and impacts on human livelihood23,24,25,35. Our solutions minimized the total land area prioritized for conservation to provide an area-efficient portfolio of lands for further consideration by conservation planners. Four key lessons can be derived from our results. First, scenarios based on the distributions of abundance of all 117 species that were integrated over the entire year required less land area to meet conservation targets than scenarios based on optimizations that used the weekly distributions of those species throughout the year. This suggests difficult trade-offs between finer-scale spatiotemporal representation of habitats throughout the year, versus solutions that achieve the best compromise on population targets across the entire year. The latter appears more area-efficient, but may miss key stopover points along migratory routes. Second, accounting for spatial clustering of species abundance through stratified sampling across the entire distribution of species increased the total land area required to achieve conservation targets. Despite requiring more land area, ensuring geographic representation may be necessary to the long-term persistence of species, particularly in widely-distributed species with population genetic structure potentially reflecting adaptation to local conditions32,33. Third, area-based plans that accommodated human activity (shared-use) were generally more efficient than intact habitat approaches that avoided areas with a high human footprint. However, because migrants vary spatially and temporally in their tolerance of human-impacted landscapes36, achieving conservation goals will likely require a portfolio of sites located in both intact and disturbed landscapes. Fourth, although our planning scenarios focused on Neotropical migratory birds, our approach could be easily adjusted and replicated in other migratory species and systems with sufficient data. In the case of birds, citizen science data and advanced prioritization tools allowed us to reveal marked efficiencies in area-based plans spanning the full annual cycle and multiple jurisdictions to conserve 117 individual species simultaneously.

## Methods

### Species selection

We used the eBird citizen-science database (Sullivan et al., 2014) for this analysis. A total of 224 species were available and we identified a subset of these for analysis using the following procedure. We first examined annual eBird distribution maps for all species to identify Neotropical migratory species (n = 181 species), defined as those with breeding ranges in North America and non-breeding ranges that extend south of the Tropic of Cancer37. We then selected terrestrial passerines from this initial group (n = 117 species, see Supporting Information Data 1). These 117 species fell into two broad groups based on their breeding and stationary non-breeding ranges: (1) species where individuals breed in North America north of the US-Mexico border and migrate south of the Tropic of Cancer during the non-breeding period (n = 101 species, Supplementary Data 1), and (2) species with both migratory and resident populations or subspecies, for which individuals from migratory populations north of the US-Mexico border move south of the Tropic of Cancer during the non-breeding period (n = 16 species). Both migratory and resident populations were included in our analysis.

### Approaches to conservation prioritization

We created 8 planning scenarios using weekly STEM models for each of 117 focal species and incorporating different assumptions about temporal scale and cost metrics employed in prioritization. First, we created area-optimized solutions to conserve 30% of the global populations of all species by optimizing during each week of the year separately (i.e., conserving 30% of populations in each week and summing these individual solutions across the year) versus over the entire year (i.e., conserving 30% of total populations calculated once throughout the year, see below and Supplementary Fig. 4 for details). We next created area-optimized solutions to conserve 30% of the global populations of all species in each week by sampling each species a) over their entire range, without accounting for spatial clustering of species abundance, or b) as 5 regional population clusters identified weekly to accommodate spatial clustering of species abundance and migratory connectivity. Third, we compared area-based conservation plans designed to represent different perspectives about the potential contribution of human-modified landscapes to the conservation of migratory birds, while including either the unrestricted cost metric (each planning unit having the same cost of 1) or the human footprint cost metric (the planning unit cost equals the 2009, 1 km resolution human footprint metric by Venter et al.20 to identify areas more and less subject to human use), to create a total of 8 scenarios (Supplementary Fig. 4). We used the prioritzr38 R package for the analysis, which interfaces with the Gurobi39 optimization software.

### Spatial prioritization approach

Here we use the concept of systematic conservation planning40, to inform choices about areas to protect, in order to optimize outcomes for biodiversity while minimizing societal costs41. To achieve the goal to optimize the trade-off between conservation benefit and socioeconomic cost, i.e., to get the most benefit for limited conservation funds, we strive to minimize an objective function over a set of decision variables, subject to a series of constraints. Integer linear programming (ILP) is the subset of optimization algorithms used here to solve reserve design problems. The general form of an ILP problem can be expressed in matrix notation as:

$${\mathrm{Minimize}}\,cx\,{\mathrm{subject}}\,{\mathrm{to}}\,Ax \ge b$$
(1)

where x is a vector of decision variables (in our case, whether to prioritize an individual planning unit), c and b are vectors of known coefficients, and A is the constraint matrix. In the minimum set cover problem, c is a vector of costs for each planning unit, b a vector of targets for each conservation feature, the relational operator would be ≥for all features, and A is the representation matrix with Aij = rij, the representation level of feature i in planning unit j. We set an objective to find the solution that fulfills all the targets and constraints for the smallest area, which we use as our measure of cost11. This objective is similar to that used in Marxan, the most widely used spatial conservation planning tool42, but has been shown to lead to more efficient solutions11.

### Spatiotemporal exploratory models

We used spatiotemporal exploratory models (STEM)9,13,43 to generate estimates of relative abundance for each species. STEM is a type of species distribution model created as an ensemble of local regression models generated from a spatiotemporal block subsampling design. Repeatedly subsampling and partitioning the study extent into grids of spatiotemporal blocks, and then fitting independent regression models (base models) in each block produces an ensemble of partially overlapping local models. Estimates at a given location and date are made by averaging across all the local models that contain the location and date. Combining estimates across the ensemble controls for inter-model variability44 and adapts to non-stationary predictor–response relationships13.

The ensemble of spatiotemporal blocks was designed as a Monte Carlo sample of 100 randomly located spatiotemporal partitions of the study extent. This results in a uniformly distributed set of spatiotemporal blocks, and up to 100 local models covering each location in the study extent. To account for spatial variation in the density of the bird observation data45, smaller spatiotemporal blocks (10° × 10° × 30 continuous days) were used north of 12° latitude and larger blocks (20° × 20° × 30 continuous days) were used in the southern portion of the study extent.

The bird observation data used to implement STEM came from the eBird citizen-science database46. The data included species counts from complete checklists collected under the ‘traveling’, ‘stationary’, and ‘areal’ protocols from January 1, 2004 to December 31, 2016 within the spatial extent bounded by 180° to 30° W Longitude (as well as Alaska between 150° E and 180° E). This resulted in a dataset consisting of 14 million checklists collected at 1.7 million unique locations, of which 10% were withheld for model validation.

Within each spatiotemporal block, species’ occupancy and abundance were both assumed to be stationary. If there were at least 50 checklists with at least 10 detections of the given species within the spatiotemporal block, we fit a two-step boosted regression tree model designed to deal with zero-inflation9 to predict the observed counts (abundance). If the minimum sample size requirements were not met for the spatiotemporal block, that spatiotemporal block was removed from the ensemble. The boosted regression trees for both steps of the zero-inflation model were fit with the gbm package47 with bag fraction = 0.80, learning rate or shrinkage = 0.05, and ntrees = 1000. The tree.depth parameter was set to 5 for the occurrence model and 10 for the abundance model, giving both models the ability to adapt to nonlinear and interacting predictor effects. We relied on the variance-reducing properties from averaging across the STEM ensemble to control for overfitting.

Three general classes of predictors were included in both boosted regression tree models: (i) spatial predictors to account for spatial (and spatiotemporal) patterns; (ii) temporal predictors to account for temporal variation at various scales; and (iii) predictors that describe the observation/detection process, which account for variation in detection rates, a nuisance when making inference about species occupancy and abundance.

Spatial information was captured using elevation48 and NASA MODIS land49 and water cover data. All 19 cover classes in the MODIS data were summarized within 2.8 × 2.8 km (784 ha) pixels centered at each eBird location. In each neighborhood, we computed the proportion of each class in the neighborhood (PLAND). To describe the spatial configuration of each class within each neighborhood we computed three statistics using FRAGSTATS50 and SDMTools51: LPI an index of the largest contiguous patch, PD an index of the patch density, and ED an index of the edge density. Summarizing the land-cover information at the 2.8 × 2.8 km resolution reduced the impact of erroneous cover classifications, and reduced the impact of inaccurate eBird checklist locations. The time of day was used to model variation in availability for detection; e.g., diurnal variation in behavior, such as participation in the ‘dawn chorus’52. Day of the year (1-366) was used to capture day-to-day changes in occupancy, and year was included to account for year-to-year differences. Finally, to account for variation in detection rates variables for the number of hours spent searching for species, the length of the transect traveled during the search, and the number of people in the search party were included in each base model.

Relative abundance was estimated as the expected count of birds of a given species on a standardized, hypothetical search conducted by a typical eBird participant starting from the center of the pixel from 7:00 to 8:00 a.m. while traveling 1 km. Estimates of relative abundance were rendered at weekly temporal resolution and 8.4 × 8.4 km spatial resolution and computed as the product of the estimated occupancy and, the estimated abundance conditional on occupancy, from the two steps of the boosted regression tree model.

Finally, to ensure good model performance, we required that the ensemble average computed for each relative abundance estimate have a sample size of at least 50 local models. In preliminary investigations based on expert review, we found that at least 50 local models were needed to control variation and limit extrapolation.

### Weekly and yearly approaches

The two approaches for full annual cycle conservation prioritization used here, were (i) weekly, where one spatial prioritization problem was solved per week and the 52 results where summed together to create a final solution where each cell was selected between 0 (i.e., never) to 52 times (i.e., selected in each solution); (ii) yearly, where the features from each week were included in one spatial prioritization problem across all 52 weeks of the year. The solution of this scenario resulted in a cell either being selected or not. For the yearly approach, each week of the year is its own feature in the optimization. This means, that for a weekly scenario we would have 117 features (one for each species) and in the yearly 117*52 features (one for each species in each week of the year). To illustrate how the two approaches, differ, we have created a theoretical example for one species and four weeks (Supplementary Fig. 5). The example represents a very small study area (40 cells in total vs. 4.3 million in our actual analysis) and the species has a total of 100 individuals in each week. The target for the species is to protect 30% (i.e., 30 individuals) of the population in each week. The example shows that in general the weekly approach captures areas important for each week of the year, independent of other weeks of the year. The ‘yearly’ approach on the other hand optimizes across the year, thereby minimizing area requirements, but as a trade-off it does not necessarily capture the most important areas in each week of the year.

### Sampling for spatial clustering of species abundance

Many of the species used here are represented by multiple sub-species or populations known or suspected to follow different migratory pathways and use different breeding or wintering habitats5,19,53. However, in the absence of detailed knowledge on migration pathways for the vast majority of species, we developed a system of stratified sampling to account for the weekly distribution and spatial structure of each of 117 focal species to ensure representation across their range throughout the annual cycle. To do so, we first conducted cluster analyses of predicted weekly distribution maps for all 117 species to identify 5 clusters of equal abundance that encompassed the entire species range to ensure representation across it. We sampled populations as 5 clusters because population structure information was missing for most species, to facilitate a data driven spatial delineation of population centers, and for computational efficiency. However, we note that the appropriate number of clusters among species may vary by species phylogeography or abundance should sufficient information become available. Our clustering approach re-draws the clusters in each week of the year, so does not consider whether spatial structure is maintained across weeks—it simply imposes a plausible structure within each week. Our cluster analysis was based on a dissimilarity matrix of geographic locations and abundances (which were weighted by 1/3 to primarily focus on geographic effects and not bias cluster delineation toward spatially separated abundance clusters), and used the CLARA algorithm, which is an extension of the k-medoids technique for large datasets54. For an example see Supplementary Fig. 6.

### Land use constraints

We used two metrics to constrain our systematic conservation prioritization. First, we used a constant, area-based cost metric for shared-use scenarios, whereby each planning unit was assigned a cost value of 1. In these scenarios, the optimization was driven solely by the species abundance predictions. Second, we used human footprint (2009; 1 km resolution)20 to identify areas more and less subject to human use, access or development pressures; specifically, we calculated the mean human footprint value for each 8.4 × 8.4 km pixel in our study area and used it as the ‘cost’ of each pixel during prioritization. We used the human footprint metric as the cost metric in intact habitat scenarios. The human footprint layer represents a composite including the following human pressures: (1) the extent of built environments; (2) crop land; (3) pasture land; (4) human population density; (5) night-time lights; (6) railways; (7) roads; and (8) navigable waterways20. The human footprint values range from 0 to 50, with 0 representing no human pressures and 50 representing the highest human pressure possible. We used the human footprint metric as a continuous constraint in the prioritization approach.

### Land cover representation

After the prioritization analyses, we summarized the major land cover types for each scenario that we generated. We used the 2015 data set of the global land cover map55 at a 300 m resolution and clipped the original data to the study area. For each scenario, we used the geospatial data abstraction library56 to warp the selected cells from the prioritization onto the raster grid of the land cover dataset. There were 37 land cover classes identified across scenarios and the frequency and area amount of each was summarized for all scenarios. As a final step we combined similar land cover classes into broader classes (Supplementary Table 2) and we used these to examine differences in area and land cover types selected under single season vs. full annual cycle planning and for intact habitat vs. shared-use scenarios (Table 2).

### Reporting summary

Further information on experimental design is available in the Nature Research Reporting Summary linked to this article.