The wild relatives of domesticated crops possess genetic diversity useful for developing more productive, nutritious and resilient crop varieties. However, their conservation status and availability for utilization are a concern, and have not been quantified globally. Here, we model the global distribution of 1,076 taxa related to 81 crops, using occurrence information collected from biodiversity, herbarium and gene bank databases. We compare the potential geographic and ecological diversity encompassed in these distributions with that currently accessible in gene banks, as a means to estimate the comprehensiveness of the conservation of genetic diversity. Our results indicate that the diversity of crop wild relatives is poorly represented in gene banks. For 313 (29.1% of total) taxa associated with 63 crops, no germplasm accessions exist, and a further 257 (23.9%) are represented by fewer than ten accessions. Over 70% of taxa are identified as high priority for further collecting in order to improve their representation in gene banks, and over 95% are insufficiently represented in regard to the full range of geographic and ecological variation in their native distributions. The most critical collecting gaps occur in the Mediterranean and the Near East, western and southern Europe, Southeast and East Asia, and South America. We conclude that a systematic effort is needed to improve the conservation and availability of crop wild relatives for use in plant breeding.

The challenges to global food security are complex and compounding. Our growing population and changing dietary expectations are projected to increase demand on food systems for at least the next four decades1,2,3,4,5, outpacing forecasted crop yield gains6. Limitations in land, water and other natural resource inputs, competition for arable soils with non-food crops and other land uses, soil degradation, climate change and the need to minimize harmful impacts on ecosystem services and biodiversity further constrain production potential3,4,7,8. Although gains in food availability may partially be obtained through dietary change and food waste reduction1,3, increases in the productivity, resilience and sustainability of current agricultural systems are clearly necessary5. Key to this sustainable intensification is the use of novel genetic diversity in plant breeding to produce crop varieties containing traits such as drought and heat tolerance, increased pest and disease resistance, and input use efficiency9,10,11.

As sources of new genetic diversity, crop wild relatives, the wild cousins of cultivated plant species, have been used for many decades for plant breeding, contributing a wide range of beneficial agronomic and nutritional traits12,13,14,15,16,17. Their utilization is expected to increase as a result of ongoing improvements in information on species and their diversity and advances in breeding tools16,18. However, this expectation is based on the assumption that crop wild relatives will be readily available for research and plant breeding, which requires their conservation as germplasm accessions in gene banks as well as functioning mechanisms to enable access to this diversity10,11. Preliminary assessments of the comprehensiveness of conservation of wild relatives in gene banks have suggested substantial gaps19,20, and wild populations of a range of species are threatened by the conversion of natural habitats to agriculture, urbanization, invasive species, mining, climate change and/or pollution21,22,23. A concerted effort devoted to improving the conservation and availability of crop wild relatives for crop improvement is thus timely both for biodiversity conservation and for food security objectives24, as the window of opportunity to resolve these deficiencies will not remain open indefinitely20,22.

We conducted a detailed analysis of the extent of representation of the wild relatives of 81 crops in gene banks equipped to provide access to these genetic resources to the global research and breeding community. The crops include major and minor cereals, root and tuber crops, oilcrops, vegetables, fruits, forages and spices, chosen on the basis of their importance to food security, income generation and sustainable agricultural production (Supplementary Table 1). We first modelled the geographic distributions of a total of 1,076 unique crop wild relative taxa from 76 genera and 24 plant families (Supplementary Table 2). We then compared the potential geographic and ecological diversity encompassed in these distributions to that which is currently accessible in gene banks25. To aid conservation strategies, we categorized taxa with a final priority score (FPS) for further collecting from the natural habitats of crop wild relatives to increase representation in gene banks, on a scale from zero to ten. The FPS was created by averaging each taxon's assessed current representation in gene banks in regard to overall number of accessions, geographic diversity and ecological diversity. High priority for further collecting was assigned for taxa where FPS ≥ 7 (that is, very little or no current representation in gene banks); medium priority where 5 ≤ FPS < 7; low priority where 2.5 ≤ FPS < 5; and sufficiently represented for taxa with FPS < 2.5. Finally, we identified geographic hotspots where considerable richness of high-priority wild relative taxa is concentrated. Such sites represent particularly valuable targets, both for efficient collecting for ex situ conservation in gene banks and for in situ conservation in protected areas.
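The scoring and categorization described above can be illustrated with a minimal sketch. It assumes that each of the three component scores runs from 0 (no representation in gene banks) to 10 (full representation) and that their average is inverted so that 10 marks the highest collecting priority; this inversion, and the function names `final_priority_score` and `priority_category`, are illustrative assumptions rather than the published implementation. The category thresholds are those stated in the text.

```python
from statistics import mean


def final_priority_score(sampling, geographic, ecological):
    """Hypothetical helper: combine three 0-10 representativeness scores into an FPS.

    Assumption: each input runs from 0 (no representation) to 10 (full
    representation), and the average is inverted so that 10 means the
    highest priority for further collecting.
    """
    scores = (sampling, geographic, ecological)
    if any(not 0 <= s <= 10 for s in scores):
        raise ValueError("scores must lie between 0 and 10")
    return 10 - mean(scores)


def priority_category(fps):
    """Map an FPS to the four categories defined in the text."""
    if fps >= 7:
        return "high priority"
    if fps >= 5:
        return "medium priority"
    if fps >= 2.5:
        return "low priority"
    return "sufficiently represented"
```

Under these assumptions, a taxon with no accessions at all scores 0 on every component, receives an FPS of 10 and falls in the high priority category.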


The distributions of crop wild relatives were modelled to occur on all continents except Antarctica, and throughout most of the tropics, subtropics and temperate regions, except the most arid areas and polar zones (Fig. 1). The greatest richness of taxa was modelled in the Mediterranean, Near East and southern Europe, South America, Southeast and East Asia, and Mesoamerica, with up to 84 taxa overlapping in a single 25 km² grid cell. These richness hotspots largely align with traditionally recognized centres of crop diversity26, although the analysis also identified a number of less well-recognized areas, for example central and western Europe, the eastern USA, southeastern Africa and northern Australia, which also contain considerable richness. Hotspots in tropical and subtropical areas also largely aligned with zones recorded as possessing high richness of endemic flora and fauna, and experiencing exceptional degrees of loss of habitat27. Temperate regions identified under the same criteria, for example the California and Cape Floristic Provinces, southwestern Australia, central Chile and New Zealand, had considerably less overlap with areas rich in crop wild relatives.

Figure 1: Crop wild relative taxon richness map.

The map displays overlapping potential distribution models for assessed crop wild relatives. Dark red indicates greater overlap of potential distributions of taxa, that is, where greater numbers of crop wild relative taxa occur in the same geographic area.
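Taxon richness of this kind amounts to counting, for every grid cell, how many taxa have a modelled presence there. The sketch below shows that counting step on toy data. The input format (a mapping from taxon name to a set of grid-cell identifiers) and the function name `taxon_richness` are assumptions made for illustration; the actual distribution models are raster layers described in the Supplementary Information.

```python
from collections import Counter


def taxon_richness(distributions):
    """Count how many taxa are modelled as present in each grid cell.

    `distributions` maps a taxon name to the set of grid-cell identifiers
    where its potential distribution model predicts presence (hypothetical
    input format).
    """
    richness = Counter()
    for cells in distributions.values():
        richness.update(set(cells))  # each taxon counts once per cell
    return richness


# Toy example with made-up taxa and (row, column) cell identifiers.
example = {
    "taxon_a": {(0, 0), (0, 1)},
    "taxon_b": {(0, 1), (1, 1)},
    "taxon_c": {(0, 1)},
}
richness = taxon_richness(example)
hotspot, count = richness.most_common(1)[0]
```

In the toy data, cell (0, 1) is shared by all three taxa and is therefore the richest cell.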

Wild relative taxa as a class of plant genetic resources were found to be critically under-represented in gene banks. For 313 (29.1% of total) taxa associated with 63 crops, no germplasm accessions exist at all, and a further 257 taxa are represented by fewer than ten accessions. A total of 765 (71.1%) taxa were ranked as high priority for further collecting from their natural habitats, 148 (13.8%) as medium priority, 118 (11.0%) as low priority and only 45 (4.2%) as currently sufficiently represented in gene banks (Supplementary Table 2). The mean FPS across all species (7.9 ± 2.5 (mean ± s.d.)) fitted well within the high priority category range (Fig. 2). Lack of geographic and ecological representation in gene banks contributed significantly to most of the high FPS values, whereas less extreme gaps were generally evident in the total numbers of accessions conserved (Supplementary Fig. 1).
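The reported percentages follow directly from the taxon counts and the total of 1,076 assessed taxa. A short arithmetic check, using only figures given in the text (the variable and function names are illustrative):

```python
TOTAL_TAXA = 1076

# Number of taxa per priority category, as reported in the text.
counts = {
    "high priority": 765,
    "medium priority": 148,
    "low priority": 118,
    "sufficiently represented": 45,
}


def share(count, total=TOTAL_TAXA):
    """Percentage of all assessed taxa, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {category: share(n) for category, n in counts.items()}
no_accessions = share(313)    # taxa with no germplasm accessions
fewer_than_ten = share(257)   # taxa with fewer than ten accessions
```

The four categories account for all 1,076 taxa, and the three categories other than "sufficiently represented" together cover 1,031 taxa, consistent with the statement that over 95% of taxa are insufficiently represented.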

Figure 2: Collecting and conservation priorities for crop wild relatives by associated crop.

Black circles represent the FPS for further collecting for wild relative taxa, with larger grey circles representing the average FPS across taxa per crop gene pool. The blue straight vertical line represents the mean FPS across all crop wild relative taxa within all crop gene pools.

An analysis of wild relatives grouped by their associated crop (that is, by crop gene pool) revealed that 72% of the crop gene pools had been assigned to high priority for further collecting (as an average of FPS scores across associated wild relative taxa), and thus require urgent conservation action (Fig. 2). These included the gene pools of commodity crops of critical importance to global food supplies and/or agricultural production, for example sugarcane (9.2 ± 1.6), sugar beet (8.1 ± 1.6) and maize (6.9 ± 2.1), as well as important food security staples such as banana and plantain (9.4 ± 0.8), cassava (9.0 ± 1.6), sorghum (8.8 ± 1.0), yams (8.5 ± 2.9), cowpea (8.4 ± 1.7), sweet potato (8.4 ± 1.7), pigeon pea (8.4 ± 1.1), millets (8.4 ± 2.7) and groundnut (7.6 ± 1.8) (Fig. 3 and Supplementary Table 1). High priority was also assigned to the gene pools of numerous crops important for smallholder income generation in the tropics (for example, cacao and papaya) and minor crops increasing in popularity because of their nutritional qualities (quinoa), as well as various other important fruits (for example, grape, apple, watermelon, orange and mango), oilcrops (rapeseed) and forages (alfalfa) possessing considerable numbers of wild related taxa. Although all gene pools contained taxa with considerable conservation concerns, the wild relatives of fruits, forages, sugar crops, starchy roots and vegetables were those assessed as least well represented in gene banks (Supplementary Fig. 2). Average FPS values across all wild relatives per crop type were 8.8 ± 1.8 for fruits, 8.7 ± 1.7 for forages, 8.6 ± 1.6 for sugar crops, 8.2 ± 2.3 for starchy roots, 8.1 ± 2.4 for vegetables, 7.2 ± 2.6 for pulses, 7.1 ± 2.3 for oilcrops, 7.1 ± 1.9 for spices and 6.4 ± 3.1 for cereals.
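The per-gene-pool values quoted above are means and standard deviations of FPS across the wild relative taxa associated with each crop. A minimal sketch of that grouping step follows, using made-up scores rather than values from Supplementary Table 2. The function name `summarise_by_gene_pool` is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which form was applied.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarise_by_gene_pool(records):
    """Return {crop: (mean FPS, s.d., number of taxa)} from (crop, fps) pairs.

    Assumption: the sample standard deviation is used. A gene pool with a
    single taxon is given an s.d. of 0.0.
    """
    by_crop = defaultdict(list)
    for crop, fps in records:
        by_crop[crop].append(fps)
    return {
        crop: (mean(values), stdev(values) if len(values) > 1 else 0.0, len(values))
        for crop, values in by_crop.items()
    }


# Made-up FPS values for two hypothetical gene pools.
records = [("crop_x", 9.0), ("crop_x", 7.0), ("crop_x", 8.0), ("crop_y", 4.0)]
summary = summarise_by_gene_pool(records)
```

Because the category is assigned from the gene pool average, a pool can be classed as low priority while still containing individual taxa that score as medium or high priority.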

Figure 3: Collecting priorities for crop wild relatives and the importance of associated crops.

The priority scale displays the average FPS across wild relatives per crop. The mean importance class of associated crops displays the significance of crops averaged across global food supplies and agricultural production metrics (see Supplementary Methods). For both axes, the scale is zero to ten, with ten representing the highest priority for further collecting/most important crop. The size of crop gene pool circles denotes the number of wild relative taxa per crop, ranging from 1 (faba bean) to 135 (cassava).

None of the 81 assessed crop gene pools demonstrated an average FPS across its wild relatives that would permit its categorization as sufficiently well represented in gene banks (Fig. 2). The wild relatives of four crops were assessed as fairly well represented, that is, as low current priority for further collecting: the gene pools of wheat (3.7 ± 2.4), grass pea (3.7 ± 2.0), chickpea (4.2 ± 2.6) and tomato (4.5 ± 1.9). Wheat and tomato, along with medium-priority crop gene pools such as sunflower (6.3 ± 2.2), rice (6.6 ± 2.5) and potato (6.7 ± 2.6), have a long history of use of wild relatives in crop improvement9,13 and benefit from relatively extensive germplasm collections. Other crop gene pools determined as low priority (grass pea and chickpea) have few wild relatives, and these generally present restricted distributions that have been fairly well sampled. However, specific taxa were assessed as under-represented in gene banks even within these low-priority gene pools. For example, five taxa related to wheat were assessed as medium or high priority, one taxon related to grass pea as medium priority, three taxa related to chickpea as medium priority and six taxa related to tomato as medium or high priority (Supplementary Table 2).

Proposed hotspots for further collecting for high-priority crop wild relatives were identified across the world's tropical, subtropical and temperate regions, with the most critical gaps identified in the Mediterranean, Near East, and southern and western Europe; Southeast and East Asia; and South America (Fig. 4). Up to 43 wild relative taxa (main map in Fig. 4) associated with up to 23 crops (inset map in Fig. 4) may potentially be collected within a single 25 km² grid cell.

Figure 4: Proposed hotspots for further collecting activities for high-priority crop wild relatives.

The map displays geographic regions where high-priority crop wild relative taxa are expected to occur and have not yet been collected and conserved in gene banks. The inset map shows gaps for under-represented taxa by crop gene pool. Dark red indicates greater overlap of potential distributions of under-represented taxa, where greater numbers of under-represented crop wild relative taxa occur in the same geographic area. For the inset map, greater numbers indicate greater overlap of taxa associated with various crops.


Our results demonstrate that crop wild relatives are currently under-represented in gene banks and that a systematic effort to improve the comprehensiveness of their conservation is critically needed. These findings are remarkable given the extensive efforts, particularly in the past half century, by international, regional and national initiatives to conserve the broad diversity of important agricultural crops11,20. The comprehensive conservation of crop genetic resources ex situ has been constrained by technical as well as political and funding challenges in recent decades11, and these constraints are most acute for wild taxa, which are less well researched than crop species and often more difficult to conserve and to utilize11,20,24. Addressing conservation gaps globally for crop wild relatives, a goal that is specifically targeted in recent major international agreements (the United Nations' Sustainable Development Goals and the Strategic Plan for Biodiversity28), will require substantial investment and extensive international collaboration. The high spatial resolution of these results is already informing such initiatives24 and can be useful to the development of further efforts.

Here we outline priorities for collecting wild relatives on the basis of their current representation in gene banks (Fig. 2 and Supplementary Table 2), and also provide an assessment of the relative importance to global food supplies and production systems worldwide of their associated crops (Fig. 3 and Supplementary Fig. 2), as well as additional information regarding the contribution of crops to food security and sustainable agriculture (Supplementary Table 1). We recommend filling gaps in ex situ conservation first for the wild relatives of crops significant to these criteria, for example rice, maize, sugarcane, cassava, potato, bananas and plantains, sorghum, millets, sweet potato, yams, groundnut, cowpea and pigeon pea.

To further refine these priorities, additional information and filters are needed. These include incorporating knowledge of threats to populations due to habitat modification, climate change and other impacts. Preliminary field surveys and threat analyses for under-represented taxa are therefore urgently needed. We note that extensive expert evaluations of the results generally confirmed the robustness of our species distribution models and conservation prioritizations but also clearly emphasized the need to address urgent threats to the survival of many crop wild relative populations (Supplementary Fig. 3). Realistic strategies for field collecting and subsequent ex situ conservation resulting in an increased availability of germplasm for plant breeding also require negotiating policy governing germplasm collecting and exchange29,30, assessing field work risks (for example, war and civil strife in regions with high levels of diversity of wild relatives), coordinating timing of field work to maximize the collection of viable seeds and other propagules, prioritizing target crop gene pools based on the interest of the breeding community in utilizing wild germplasm, and determining the relative difficulty of maintenance of targeted wild germplasm in gene banks. Although the seeds of most wild relatives can be maintained under standard conditions for long-term conservation ex situ, some wild relatives produce recalcitrant seeds or do not produce seeds at all. Such wild relatives may require more expensive approaches (for example, in vitro or cryopreservation), and particularly for such taxa alternative conservation strategies such as the establishment of in situ conservation reserves may be more effective.

Despite an extensive effort to compile occurrence records from more than 400 different data sources, the wild relatives of a number of important agricultural crops (namely coffee, tea and avocado) were not assessed because of the lack of sufficient accessible data. We also note that a number of agricultural crops are not currently known to possess closely related wild relatives, including taro (Colocasia esculenta), coconut (Cocos nucifera) and date palm (Phoenix dactylifera). Improvements in the generation and accessibility of taxonomic, relatedness and geographic information on wild relatives19,31 may permit conservation assessments for some of these gene pools in the future.

The combination of the sampling, geographic and ecological representativeness scores used to determine the extent of conservation of the wild relatives of important agricultural crops in gene banks represents an efficient methodology for prioritizing taxa across crop gene pools, given wide variations in the potential diversity encompassed in each taxon and the general absence of molecular data for such species. The sampling representativeness score provided an indication of the total number of germplasm accessions estimated as sufficient to represent a taxon, relative to the known extent of the taxon and utilizing all gene bank and reference data regardless of whether geographical coordinates are available. Geographic and ecological variation metrics were used as proxies for genetic diversity and potential functional adaptation to diverse environments, based on the assumption that the genetic composition of plant species varies across geographic range and is associated with adaptation to different ecological conditions32. The increasing power and decreasing costs of direct measures of diversity in genomes may make significant future refinements of priorities achievable10. However, further collecting is still needed for a very large number of wild relatives in order to assemble sufficient samples to perform such genetic assessments and to help resolve taxonomic and gene pool assignment uncertainties33.
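As a rough illustration of a sampling representativeness score, the sketch below assumes that the score is the share of all known records of a taxon that are held as gene bank accessions, scaled to 0 to 10. This formula is an assumption made for illustration only; the exact calculation is given in the Supplementary Information and the gap analysis methodology25, and the function name `sampling_representativeness` is hypothetical.

```python
def sampling_representativeness(genebank_accessions, reference_records):
    """Hypothetical 0-10 score: share of all known records held as germplasm.

    Assumption: the score is the number of gene bank accessions divided by
    the total number of records (gene bank plus herbarium and other
    reference records), scaled to 0-10. Records without geographical
    coordinates still count towards both totals.
    """
    if genebank_accessions < 0 or reference_records < 0:
        raise ValueError("record counts cannot be negative")
    total = genebank_accessions + reference_records
    if total == 0:
        return 0.0  # no records at all: treat the taxon as unrepresented
    return 10 * genebank_accessions / total
```

Under this assumption, a taxon known from 75 reference records and 25 gene bank accessions would score 2.5, whereas a taxon with no accessions scores 0 however many reference records exist.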


Methods used for gathering data, modelling, analyses and the associated references are available in the Supplementary Information.

Interactive maps displaying occurrence data coordinates, potential distribution models, further collecting priority maps and collecting priority categories for the crop wild relatives analysed are available at http://www.cwrdiversity.org/distribution-map/. Occurrence data used for this analysis are available at http://www.cwrdiversity.org/checklist/cwr-occurrences.php. Further information on expert evaluations of the gap analysis is available at http://www.cwrdiversity.org/expert-evaluation/.


  1. Global diets link environmental sustainability and human health. Nature 515, 518–522 (2014).
  2. Increasing homogeneity in global food supplies and the implications for food security. Proc. Natl Acad. Sci. USA 111, 4001–4006 (2014).
  3. Solutions for a cultivated planet. Nature 478, 337–342 (2011).
  4. Food security: the challenge of feeding 9 billion people. Science 327, 812–818 (2010).
  5. Global food demand and the sustainable intensification of agriculture. Proc. Natl Acad. Sci. USA 108, 20260–20264 (2011).
  6. Yield trends are insufficient to double global crop production by 2050. PLoS One 8, e66428 (2013).
  7. The story of phosphorus: global food security and food for thought. Global Environ. Change 19, 292–305 (2009).
  8. Rising temperatures reduce global wheat production. Nature Clim. Change 5, 143–147 (2015).
  9. Genetic diversity and disease control in rice. Nature 406, 718–722 (2000).
  10. Feeding the future. Nature 499, 23–24 (2013).
  11. Protecting crop genetic diversity for food security: political, ethical and technical challenges. Nature Rev. Genet. 6, 946–953 (2005).
  12. The use of wild relatives in crop improvement: a survey of developments over the last 20 years. Euphytica 156, 1–13 (2007).
  13. Genes from wild rice improve yield. Nature 384, 356–358 (1996).
  14. Unused natural variation can lift yield barriers in plant breeding. PLoS Biol. 2, e245 (2004).
  15. Through the genetic bottleneck: O. rufipogon as a source of trait-enhancing alleles for O. sativa. Euphytica 154, 317–339 (2007).
  16. Seed banks and molecular maps: unlocking genetic potential from the wild. Science 277, 1063–1066 (1997).
  17. A walk on the wild side. Nature Clim. Change 1, 374–375 (2011).
  18. Genomics of gene banks: a case study in rice. Am. J. Bot. 99, 407–423 (2012).
  19. A prioritized crop wild relative inventory to help underpin global food security. Biol. Conserv. 167, 265–275 (2013).
  20. Food and Agriculture Organization of the United Nations (FAO). The Second Report on the State of the World's Plant Genetic Resources for Food and Agriculture (Commission on Genetic Resources for Food and Agriculture, FAO, 2010).
  21. The effect of climate change on crop wild relatives. Agric. Ecosyst. Environ. 126, 13–23 (2008).
  22. Urgent notice to all maize researchers: disappearance and extinction of the last wild Teosinte population is more than half completed. A modest proposal for Teosinte evolution and conservation in situ: the Balsas, Guerrero, Mexico. Maydica 52, 49–58 (2007).
  23. Green plants in the red: a baseline global assessment for the IUCN sampled Red List Index for plants. PLoS One 10, e0135152 (2015).
  24. Adapting agriculture to climate change: a global initiative to collect, conserve, and use crop wild relatives. Agroecol. Sustain. Food Syst. 38, 369–377 (2013).
  25. A gap analysis methodology for collecting crop gene pools: a case study with Phaseolus beans. PLoS One 5, e13497 (2010).
  26. Centers of origin of cultivated plants. Bull. Appl. Bot. Plant Breed. 16, (1926).
  27. Biodiversity hotspots for conservation priorities. Nature 403, 853–858 (2000).
  28. Secretariat of the Convention on Biological Diversity (CBD). Decision X/2. The Strategic Plan for Biodiversity 2011-2020 and the Aichi Biodiversity Targets (2010).
  29. Food and Agriculture Organization of the United Nations (FAO). International Treaty on Plant Genetic Resources for Food and Agriculture (2009).
  30. Secretariat of the Convention on Biological Diversity (CBD). Nagoya Protocol on Access to Genetic Resources and the Fair and Equitable Sharing of Benefits Arising from their Utilization to the Convention on Biological Diversity (2011).
  31. In ISHS Acta Horticulturae 948 I International Symposium Wild Relatives of Subtropical Temperate Fruit and Nut Crops (eds Aradhya, M. K. & Kluepfel, D. A.) 285–288 (ISHS, 2012).
  32. Genetic variation across species' geographical ranges: the central-marginal hypothesis and beyond. Mol. Ecol. 17, 1170–1188 (2008).
  33. Widespread mistaken identity in tropical plant collections. Curr. Biol. 25, R1066–R1067 (2015).



We thank J. Wiersema and B. León for major contributions to taxonomic concepts; the herbaria, gene banks, researchers and other sources that contributed occurrence data to the analysis (Supplementary Table 3); the expert evaluators of gap analysis results (Supplementary Table 4); S. Calderón, I. Vanegas, H. Tobón, D. Arango, H. Dorado and E. Guevara for data inputs and processing; and S. Prager for comments. This work was undertaken as part of the project ‘Adapting Agriculture to Climate Change: Collecting, Protecting and Preparing Crop Wild Relatives’, which is supported by the Government of Norway. The project is managed by the Global Crop Diversity Trust and the Millennium Seed Bank of the Royal Botanic Gardens, Kew, and implemented in partnership with national and international gene banks and plant breeding institutes around the world. For further information, visit the project website: http://www.cwrdiversity.org/. Funding was also provided by the CGIAR Research Program on Climate Change, Agriculture, and Food Security, Cali, Colombia.

Author information

Author notes

    • Nora P. Castañeda-Álvarez & Colin K. Khoury

    These authors contributed equally to this work.


  1. International Center for Tropical Agriculture (CIAT), Km 17, Recta Cali-Palmira, Cali 763537, Colombia

    • Nora P. Castañeda-Álvarez, Colin K. Khoury, Harold A. Achicanoy, Vivian Bernau, Andy Jarvis, Julian Ramirez-Villegas & Chrystian C. Sosa
  2. School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK

    • Nora P. Castañeda-Álvarez, Nigel Maxted & Holly Vincent
  3. Centre for Crop Systems Analysis, Wageningen University, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands

    • Colin K. Khoury & Paul C. Struik
  4. Global Crop Diversity Trust, Platz der Vereinten Nationen 7, 53115 Bonn, Germany

    • Hannes Dempewolf, Luigi Guarino & Jane Toll
  5. Royal Botanic Gardens, Kew, Conservation Science, Millennium Seed Bank, Wakehurst Place, Ardingly RH17 6TN, UK

    • Ruth J. Eastwood, Ruth H. Harker & Jonas V. Müller
  6. CGIAR Research Program on Climate Change, Agriculture and Food Security (CCAFS), Km 17, Recta Cali-Palmira, Cali 763537, Colombia

    • Andy Jarvis & Julian Ramirez-Villegas
  7. Institute for Climate and Atmospheric Science, School of Earth and Environment, University of Leeds, LS2 9JT, UK

    • Julian Ramirez-Villegas



N.P.C.-A., C.K.K., H.D., R.J.E., L.G., A.J., N.M., J.M., J.R.-V. and J.T. conceived and designed the study. N.P.C.-A., C.K.K., H.D., R.J.E., R.H.H., A.J., N.M., J.R.-V., C.C.S. and H.V. acquired and contributed data. N.P.C.-A., C.K.K., H.A.A., V.B. and C.C.S. processed the data, performed the analyses and analysed the results. N.P.C.-A., C.K.K., H.D., R.J.E., L.G., A.J., N.M. and J.M. interpreted the results and wrote the manuscript. N.P.C.-A., C.K.K., V.B., H.D., R.J.E., L.G., A.J., N.M., J.M., J.R.-V. and P.C.S. edited the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Nora P. Castañeda-Álvarez.

Supplementary information