The wild relatives of domesticated crops possess genetic diversity useful for developing more productive, nutritious and resilient crop varieties. However, their conservation status and availability for utilization are a concern, and have not been quantified globally. Here, we model the global distribution of 1,076 taxa related to 81 crops, using occurrence information collected from biodiversity, herbarium and gene bank databases. We compare the potential geographic and ecological diversity encompassed in these distributions with that currently accessible in gene banks, as a means to estimate the comprehensiveness of the conservation of genetic diversity. Our results indicate that the diversity of crop wild relatives is poorly represented in gene banks. For 313 (29.1% of total) taxa associated with 63 crops, no germplasm accessions exist, and a further 257 (23.9%) are represented by fewer than ten accessions. Over 70% of taxa are identified as high priority for further collecting in order to improve their representation in gene banks, and over 95% are insufficiently represented in regard to the full range of geographic and ecological variation in their native distributions. The most critical collecting gaps occur in the Mediterranean and the Near East, western and southern Europe, Southeast and East Asia, and South America. We conclude that a systematic effort is needed to improve the conservation and availability of crop wild relatives for use in plant breeding.
The challenges to global food security are complex and compounding. Our growing population and changing dietary expectations are projected to increase demand on food systems for at least the next four decades1.
As sources of new genetic diversity, crop wild relatives (the wild cousins of cultivated plant species) have been used for many decades in plant breeding, contributing a wide range of beneficial agronomic and nutritional traits12.
We conducted a detailed analysis of the extent of representation of the wild relatives of 81 crops in gene banks equipped to provide access to these genetic resources to the global research and breeding community. The crops include major and minor cereals, root and tuber crops, oilcrops, vegetables, fruits, forages and spices, chosen on the basis of their importance to food security, income generation and sustainable agricultural production (Supplementary Table 1). We first modelled the geographic distributions of a total of 1,076 unique crop wild relative taxa from 76 genera and 24 plant families (Supplementary Table 2). We then compared the potential geographic and ecological diversity encompassed in these distributions to that which is currently accessible in gene banks25. To aid conservation strategies, we categorized taxa with a final priority score (FPS) for further collecting from the natural habitats of crop wild relatives to increase representation in gene banks, on a scale from zero to ten. The FPS was created by averaging each taxon's assessed current representation in gene banks in regard to overall number of accessions, geographic diversity and ecological diversity. High priority for further collecting was assigned for taxa where FPS ≥ 7 (that is, very little or no current representation in gene banks); medium priority where 5 ≤ FPS < 7; low priority where 2.5 ≤ FPS < 5; and sufficiently represented for taxa with FPS < 2.5. Finally, we identified geographic hotspots where considerable richness of high-priority wild relative taxa is concentrated. Such sites represent particularly valuable targets, both for efficient collecting for ex situ conservation in gene banks and for in situ conservation in protected areas.
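The scoring and categorization described above can be illustrated with a minimal sketch. The thresholds are those stated in the text. The function names, and the assumption that each of the three component scores is already expressed on the same 0 to 10 scale and oriented like the FPS (higher meaning poorer representation in gene banks), are ours for illustration and are not taken from the Supplementary Information.

```python
from statistics import mean

# Category thresholds as stated in the text (FPS runs from 0 to 10).
HIGH, MEDIUM, LOW = 7.0, 5.0, 2.5


def final_priority_score(sampling_gap, geographic_gap, ecological_gap):
    """Average three component scores into a final priority score (FPS).

    Assumption (not from the source): every component is already on the
    0-10 scale and oriented like the FPS, so higher = poorer representation.
    """
    components = (sampling_gap, geographic_gap, ecological_gap)
    if any(not 0 <= c <= 10 for c in components):
        raise ValueError("component scores must lie in [0, 10]")
    return mean(components)


def priority_category(fps):
    """Map an FPS to the collecting-priority category defined in the text."""
    if not 0 <= fps <= 10:
        raise ValueError("FPS must lie in [0, 10]")
    if fps >= HIGH:
        return "high"
    if fps >= MEDIUM:
        return "medium"
    if fps >= LOW:
        return "low"
    return "sufficiently represented"
```

Under these assumptions, a taxon with no accessions at all would score ten on every component and fall in the high-priority category, whereas a gene pool average such as 3.7 would be classed as low priority.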
The distributions of crop wild relatives were modelled to occur on all continents except Antarctica, and throughout most of the tropics, subtropics and temperate regions, except the most arid areas and polar zones (Fig. 1). The greatest richness of taxa was modelled in the Mediterranean, Near East and southern Europe, South America, Southeast and East Asia, and Mesoamerica, with up to 84 taxa overlapping in a single 25 km² grid cell. These richness hotspots largely align with traditionally recognized centres of crop diversity26, although the analysis also identified a number of less well-recognized areas, for example central and western Europe, the eastern USA, southeastern Africa and northern Australia, which also contain considerable richness. Hotspots in tropical and subtropical areas also largely aligned with zones recorded as possessing high richness of endemic flora and fauna, and experiencing exceptional degrees of loss of habitat27. Temperate regions identified under the same criteria, for example the California and Cape Floristic Provinces, southwestern Australia, central Chile and New Zealand, had considerably less overlap with areas rich in crop wild relatives.
Wild relative taxa as a class of plant genetic resources were found to be critically under-represented in gene banks. For 313 (29.1% of total) taxa associated with 63 crops, no germplasm accessions exist at all, and a further 257 taxa are represented by fewer than ten accessions. A total of 765 (71.1%) taxa were ranked as high priority for further collecting from their natural habitats, 148 (13.8%) as medium priority, 118 (11.0%) as low priority and only 45 (4.2%) as currently sufficiently represented in gene banks (Supplementary Table 2). The mean FPS across all species (7.9 ± 2.5 (mean ± s.d.)) fitted well within the high priority category range (Fig. 2). Lack of geographic and ecological representation in gene banks contributed significantly to most of the high FPS values, whereas less extreme gaps were generally evident in the total numbers of accessions conserved (Supplementary Fig. 1).
An analysis of wild relatives grouped by their associated crop (that is, by crop gene pool) revealed that 72% of the crop gene pools had been assigned to high priority for further collecting (as an average of FPS scores across associated wild relative taxa), and thus require urgent conservation action (Fig. 2). These included the gene pools of commodity crops of critical importance to global food supplies and/or agricultural production, for example sugarcane (9.2 ± 1.6), sugar beet (8.1 ± 1.6) and maize (6.9 ± 2.1), as well as important food security staples such as banana and plantain (9.4 ± 0.8), cassava (9.0 ± 1.6), sorghum (8.8 ± 1.0), yams (8.5 ± 2.9), cowpea (8.4 ± 1.7), sweet potato (8.4 ± 1.7), pigeon pea (8.4 ± 1.1), millets (8.4 ± 2.7) and groundnut (7.6 ± 1.8) (Fig. 3 and Supplementary Table 1). High priority was also assigned to the gene pools of numerous crops important for smallholder income generation in the tropics (for example, cacao and papaya) and minor crops increasing in popularity because of their nutritional qualities (quinoa), as well as various other important fruits (for example, grape, apple, watermelon, orange and mango), oilcrops (rapeseed) and forages (alfalfa) possessing considerable numbers of wild related taxa. Although all gene pools contained taxa with considerable conservation concerns, the wild relatives of fruits, forages, sugar crops, starchy roots and vegetables were those assessed as least well represented in gene banks (Supplementary Fig. 2). Average FPS values across all wild relatives per crop type were 8.8 ± 1.8 for fruits, 8.7 ± 1.7 for forages, 8.6 ± 1.6 for sugar crops, 8.2 ± 2.3 for starchy roots, 8.1 ± 2.4 for vegetables, 7.2 ± 2.6 for pulses, 7.1 ± 2.3 for oilcrops, 7.1 ± 1.9 for spices and 6.4 ± 3.1 for cereals.
None of the 81 assessed crop gene pools demonstrated an average FPS across its wild relatives that would permit its categorization as sufficiently well represented in gene banks (Fig. 2). The wild relatives of only four crops were assessed as fairly well represented, that is, as low current priority for further collecting: the gene pools of wheat (3.7 ± 2.4), grass pea (3.7 ± 2.0), chickpea (4.2 ± 2.6) and tomato (4.5 ± 1.9). Wheat and tomato, along with medium-priority crop gene pools such as sunflower (6.3 ± 2.2), rice (6.6 ± 2.5) and potato (6.7 ± 2.6), have a long history of use of wild relatives in crop improvement9,13 and benefit from relatively extensive germplasm collections. Other crop gene pools determined as low priority (grass pea and chickpea) have few wild relatives, and these generally present restricted distributions that have been fairly well sampled. However, specific taxa were assessed as under-represented in gene banks even within these low-priority gene pools. For example, five taxa related to wheat were assessed as medium or high priority, one taxon related to grass pea as medium priority, three taxa related to chickpea as medium priority and six taxa related to tomato as medium or high priority (Supplementary Table 2).
Proposed hotspots for further collecting for high-priority crop wild relatives were identified across the world's tropical, subtropical and temperate regions, with the most critical gaps identified in the Mediterranean, Near East, and southern and western Europe; Southeast and East Asia; and South America (Fig. 4). Up to 43 wild relative taxa (main map in Fig. 4) associated with up to 23 crops (inset map in Fig. 4) may potentially be collected within a single 25 km² grid cell.
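The per-cell counts reported above amount to stacking the taxon-level maps and tallying overlaps. The following is a minimal sketch of that tally, assuming each taxon's area of interest has already been reduced to a set of grid-cell identifiers. The data structures, function names and toy values are hypothetical and do not reproduce the modelling workflow in the Supplementary Information.

```python
from collections import Counter


def richness_by_cell(taxon_cells, priorities=None, keep="high"):
    """Count how many taxa are mapped to each grid cell.

    taxon_cells: dict of taxon name -> iterable of grid-cell ids
    priorities:  optional dict of taxon name -> priority category; when
                 given, only taxa whose category equals `keep` are counted.
    """
    counts = Counter()
    for taxon, cells in taxon_cells.items():
        if priorities is not None and priorities.get(taxon) != keep:
            continue
        counts.update(set(cells))  # set() so a taxon counts once per cell
    return counts


def hotspots(counts, min_taxa):
    """Return the cells holding at least `min_taxa` taxa, in sorted order."""
    return sorted(cell for cell, n in counts.items() if n >= min_taxa)


# Toy example: cells are (row, column) pairs, taxa are placeholders.
toy_cells = {
    "taxon_a": {(0, 0), (0, 1)},
    "taxon_b": {(0, 1)},
    "taxon_c": {(0, 1), (1, 1)},
}
toy_priorities = {"taxon_a": "high", "taxon_b": "medium", "taxon_c": "high"}
high_priority_richness = richness_by_cell(toy_cells, toy_priorities)
```

In the toy data, only cell (0, 1) holds two high-priority taxa, so it would be the single hotspot returned by `hotspots(high_priority_richness, 2)`.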
Our results demonstrate that crop wild relatives are currently under-represented in gene banks and that a systematic effort to improve the comprehensiveness of their conservation is critically needed. These findings are remarkable given the extensive efforts, particularly in the past half century, by international, regional and national initiatives to conserve the broad diversity of important agricultural crops11,20. Achieving the comprehensive conservation of crop genetic resources ex situ has been constrained by technical as well as political and funding challenges in recent decades11, and these constraints are most acute for wild taxa, which are less well researched than crop species and often more difficult to conserve and to utilize11,20,24. Addressing conservation gaps globally for crop wild relatives, a goal that is specifically targeted in recent major international agreements (the United Nations' Sustainable Development Goals and the Strategic Plan for Biodiversity28), will require substantial investment and extensive international collaboration. The high spatial resolution of these results is already informing such initiatives24 and can be useful to the development of further efforts.
Here we outline priorities for collecting wild relatives on the basis of their current representation in gene banks (Fig. 2 and Supplementary Table 2), and also provide an assessment of the relative importance to global food supplies and production systems worldwide of their associated crops (Fig. 3 and Supplementary Fig. 2), as well as additional information regarding the contribution of crops to food security and sustainable agriculture (Supplementary Table 1). We recommend filling gaps in ex situ conservation first for the wild relatives of crops significant to these criteria, for example rice, maize, sugarcane, cassava, potato, bananas and plantains, sorghum, millets, sweet potato, yams, groundnut, cowpea and pigeon pea.
To further refine these priorities, additional information and filters are needed. These include incorporating knowledge of threats to populations due to habitat modification, climate change and other impacts. Preliminary field surveys and threat analyses for under-represented taxa are therefore urgently needed. We note that extensive expert evaluations of the results generally confirmed the robustness of our species distribution models and conservation prioritizations but also clearly emphasized the need to address urgent threats to the survival of many crop wild relative populations (Supplementary Fig. 3). Realistic strategies for field collecting and subsequent ex situ conservation resulting in an increased availability of germplasm for plant breeding also require negotiating policy governing germplasm collecting and exchange29,30, assessing field work risks (for example, war and civil strife in regions with high levels of diversity of wild relatives), coordinating timing of field work to maximize the collection of viable seeds and other propagules, prioritizing target crop gene pools based on the interest of the breeding community in utilizing wild germplasm, and determining the relative difficulty of maintenance of targeted wild germplasm in gene banks. Although the seeds of most wild relatives can be maintained under standard conditions for long-term conservation ex situ, some wild relatives produce recalcitrant seeds or do not produce seeds at all. Such wild relatives may require more expensive approaches (for example, in vitro or cryopreservation), and particularly for such taxa alternative conservation strategies such as the establishment of in situ conservation reserves may be more effective.
Despite an extensive effort to compile occurrence records from more than 400 different data sources, the wild relatives of a number of important agricultural crops (namely coffee, tea and avocado) were not assessed because of the lack of sufficient accessible data. We also note that a number of agricultural crops are not currently known to possess closely related wild relatives, including taro (Colocasia esculenta), coconut (Cocos nucifera) and date palm (Phoenix dactylifera). Improvements in the generation and accessibility of taxonomic, relatedness and geographic information on wild relatives19,31 may permit conservation assessments for some of these gene pools in the future.
The combination of the sampling, geographic and ecological representativeness scores used to determine the extent of conservation of the wild relatives of important agricultural crops in gene banks represents an efficient methodology for prioritizing taxa across crop gene pools, given wide variations in the potential diversity encompassed in each taxon and the general absence of molecular data for such species. The sampling representativeness score permitted an indication of the total number of germplasm accessions estimated as sufficient to represent a taxon, relative to the known extent of the taxon and utilizing all gene bank and reference data regardless of whether geographical coordinates are available. Geographic and ecological variation metrics were used as proxies for genetic diversity and potential functional adaptation to diverse environments, based on the assumption that the genetic composition of plant species varies across geographic range and is associated with adaptation to different ecological conditions32. The increasing power and decreasing costs of direct measures of diversity in genomes may make significant future refinements of priorities achievable10. However, further collecting is still needed for a very large number of wild relatives in order to assemble sufficient samples to perform such genetic assessments and to help resolve taxonomic and gene pool assignment uncertainties33.
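To make the idea of representativeness concrete, the sketch below expresses each of the three dimensions as a simple coverage fraction between zero and one. These ratios are illustrative assumptions only: the exact formulas, scaling and inputs used in the analysis are given in the Supplementary Information and are not reproduced here, and all function and parameter names are hypothetical.

```python
def sampling_coverage(genebank_records, reference_records):
    """Share of all known records that are gene bank accessions.

    Assumption: `reference_records` counts herbarium and other non-gene-bank
    records; coordinates are not needed for this ratio.
    """
    total = genebank_records + reference_records
    if total == 0:
        return 0.0  # no records at all: treat as no coverage
    return genebank_records / total


def fraction_covered(sampled, potential):
    """Fraction of `potential` units that also appear in `sampled`.

    Used here for both proxies: grid cells of the modelled distribution
    (geographic) and ecological classes within that distribution (ecological).
    """
    potential = set(potential)
    if not potential:
        return 0.0
    return len(set(sampled) & potential) / len(potential)
```

For example, a taxon with 5 accessions and 15 reference records would have a sampling coverage of 0.25, and accessions collected in two of four ecological classes within the modelled range would give an ecological coverage of 0.5.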
Methods used for gathering data, modelling, analyses and the associated references are available in the Supplementary Information.
Interactive maps displaying occurrence data coordinates, potential distribution models, further collecting priority maps and collecting priority categories for the crop wild relatives analysed are available at http://www.cwrdiversity.org/distribution-map/. Occurrence data used for this analysis are available at http://www.cwrdiversity.org/checklist/cwr-occurrences.php. Further information on expert evaluations of the gap analysis is available at http://www.cwrdiversity.org/expert-evaluation/.
We thank J. Wiersema and B. León for major contributions to taxonomic concepts; the herbaria, gene banks, researchers and other sources that contributed occurrence data to the analysis (Supplementary Table 3); the expert evaluators of gap analysis results (Supplementary Table 4); S. Calderón, I. Vanegas, H. Tobón, D. Arango, H. Dorado and E. Guevara for data inputs and processing; and S. Prager for comments. This work was undertaken as part of the project ‘Adapting Agriculture to Climate Change: Collecting, Protecting and Preparing Crop Wild Relatives’, which is supported by the Government of Norway. The project is managed by the Global Crop Diversity Trust and the Millennium Seed Bank of the Royal Botanic Gardens, Kew, and implemented in partnership with national and international gene banks and plant breeding institutes around the world. For further information, visit the project website: http://www.cwrdiversity.org/. Funding was also provided by the CGIAR Research Program on Climate Change, Agriculture, and Food Security, Cali, Colombia.