Genebank genomics highlights the diversity of a global barley collection

Nature Genetics (2018)

Abstract

Genebanks hold comprehensive collections of cultivars, landraces and crop wild relatives of all major food crops, but their detailed characterization has so far been limited to sparse core sets. The analysis of genome-wide genotyping-by-sequencing data for almost all barley accessions of the German ex situ genebank provides insights into the global population structure of domesticated barley and points out redundancies and coverage gaps in one of the world’s major genebanks. Our large sample size and dense marker data afford great power for genome-wide association scans. We detect known and novel loci underlying morphological traits differentiating barley genepools, find evidence for convergent selection for barbless awns in barley and rice and show that a major-effect resistance locus conferring resistance to bymovirus infection has been favored by traditional farmers. This study outlines future directions for genomics-assisted genebank management and the utilization of germplasm collections for linking natural variation to human selection during crop evolution.

Data availability

Sequence data collected in this study have been deposited at the European Nucleotide Archive (accession numbers PRJEB23967, PRJEB24563, PRJEB24627, PRJEB26634, PRJEB26652 and PRJEB27184; Supplementary Table 1). SNP matrices and phenotypic data have been deposited at https://doi.org/10.5447/IPK/2018/9. Passport data for all accessions are reported in Supplementary Table 1. Phenotypic data used for GWAS are reported in Supplementary Table 4 (morphological characters), Supplementary Table 6 (virus resistance), and at https://doi.org/10.5447/IPK/2018/10 (flowering time). Passport, phenotypic and sequence data can be browsed in the BRIDGE web portal (http://bridge.ipk-gatersleben.de).

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  1. Novembre, J. et al. Genes mirror geography within Europe. Nature 456, 98–101 (2008).
  2. Russell, J. et al. Exome sequencing of geographically diverse barley landraces and wild relatives gives insights into environmental adaptation. Nat. Genet. 48, 1024–1030 (2016).
  3. Brachi, B., Morris, G. P. & Borevitz, J. O. Genome-wide association studies in plants: the missing heritability is in the field. Genome Biol. 12, 232 (2011).
  4. Lopes, M. S. et al. Exploiting genetic diversity from landraces in wheat breeding for adaptation to climate change. J. Exp. Bot. 66, 3477–3486 (2015).
  5. Oppermann, M., Weise, S., Dittmann, C. & Knüpffer, H. GBIS: the information system of the German Genebank. Database 2015, bav021 (2015).
  6. Pourkheirandish, M. et al. Evolution of the grain dispersal system in barley. Cell 162, 527–539 (2015).
  7. Poland, J. A., Brown, P. J., Sorrells, M. E. & Jannink, J.-L. Development of high-density genetic maps for barley and wheat using a novel two-enzyme genotyping-by-sequencing approach. PLoS ONE 7, e32253 (2012).
  8. Wendler, N. et al. Unlocking the secondary gene-pool of barley with next-generation sequencing. Plant Biotechnol. J. 12, 1122–1131 (2014).
  9. Mascher, M. et al. A chromosome conformation capture ordered sequence of the barley genome. Nature 544, 427–433 (2017).
  10. Jakob, S. S. et al. Evolutionary history of wild barley (Hordeum vulgare subsp. spontaneum) analyzed using multilocus sequence data and paleodistribution modeling. Genome Biol. Evol. 6, 685–702 (2014).
  11. Chen, F. H. et al. Agriculture facilitated permanent human occupation of the Tibetan Plateau after 3600 bp. Science 347, 248–250 (2015).
  12. Badr, A. et al. On the origin and domestication history of barley (Hordeum vulgare). Mol. Biol. Evol. 17, 499–510 (2000).
  13. Blattner, F. R. & Méndez, A. G. B. RAPD data do not support a second centre of barley domestication in Morocco. Genet. Resour. Crop Evol. 48, 13–19 (2001).
  14. Pourkheirandish, M. et al. Elucidation of the origin of ‘agriocrithon’ based on domestication genes questions the hypothesis that Tibet is one of the centers of barley domestication. Plant J. 94, 525–534 (2018).
  15. Galinsky, K. J. et al. Fast principal-component analysis reveals convergent evolution of ADH1B in Europe and East Asia. Am. J. Hum. Genet. 98, 456–472 (2016).
  16. Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
  17. Knüpffer, H. & van Hintum, T. J. L. in Core Collections of Plant Genetic Resources (eds Hodgkin, T., Brown, A. H. D., van Hintum, T. J. L. & Morales, E. A. V.) 171–178 (John Wiley and Sons, Chichester, UK, 1995).
  18. van Hintum, T. J. L. & Knüpffer, H. Duplication within and between germplasm collections. I. Genet. Resour. Crop Evol. 42, 127–133 (1995).
  19. van Hintum, T. J. L. & Visser, D. L. Duplication within and between germplasm collections. II. Genet. Resour. Crop Evol. 42, 135–145 (1995).
  20. Parzies, H., Spoor, W. & Ennos, R. Genetic diversity of barley landrace accessions (Hordeum vulgare ssp. vulgare) conserved for different lengths of time in ex situ gene banks. Heredity 84, 476 (2000).
  21. Harlan, J. R. Ethiopia: a center of diversity. Econ. Bot. 23, 309–314 (1969).
  22. Castañeda-Álvarez, N. P. et al. Global conservation priorities for crop wild relatives. Nat. Plants 2, 16022 (2016).
  23. Swarts, K. et al. Novel methods to optimize genotypic imputation for low-coverage, next-generation sequence data in crop plants. Plant Genome 7, https://doi.org/10.3835/plantgenome2014.05.0023 (2014).
  24. Mansfeld, R. Das morphologische System der Saatgerste, Hordeum vulgare L. sl. Der Züchter 20, 8–24 (1950).
  25. Komatsuda, T. et al. Six-rowed barley originated from a mutation in a homeodomain-leucine zipper I-class homeobox gene. Proc. Natl Acad. Sci. USA 104, 1424–1429 (2007).
  26. Ramsay, L. et al. INTERMEDIUM-C, a modifier of lateral spikelet fertility in barley, is an ortholog of the maize domestication gene TEOSINTE BRANCHED 1. Nat. Genet. 43, 169–172 (2011).
  27. Youssef, H. M. et al. Natural diversity of inflorescence architecture traces cryptic domestication genes in barley (Hordeum vulgare L.). Genet. Resour. Crop Evol. 64, 843–853 (2017).
  28. Lundqvist, U. Hexastichon and intermedium mutants in barley. Hereditas 92, 229–236 (1980).
  29. Taketa, S. et al. Barley grain with adhering hulls is controlled by an ERF family transcription factor gene regulating a lipid biosynthesis pathway. Proc. Natl Acad. Sci. USA 105, 4062–4067 (2008).
  30. Elbaum, R., Zaltzman, L., Burgert, I. & Fratzl, P. The role of wheat awns in the seed dispersal unit. Science 316, 884–886 (2007).
  31. Liller, C. B. et al. Fine mapping of a major QTL for awn length in barley using a multiparent mapping population. Theor. Appl. Genet. 130, 269–281 (2017).
  32. Franckowiak, J. D. BGS 312; Smooth awn 1. Barley Genet. Newsl. 26, 261 (1997).
  33. Hua, L. et al. LABA1, a domestication gene associated with long, barbed awns in wild rice. Plant Cell 27, 1875–1888 (2015).
  34. Gottwald, S., Bauer, P., Komatsuda, T., Lundqvist, U. & Stein, N. TILLING in the two-rowed barley cultivar ‘Barke’ reveals preferred sites of functional diversity in the gene HvHox1. BMC Res. Notes 2, 258 (2009).
  35. Åberg, E. & Wiebe, G. A. Classification of Barley Varieties Grown in the United States and Canada in 1945 (US Department of Agriculture, Washington, DC, USA, 1946).
  36. Longin, C. F. H. & Reif, J. C. Redesigning the exploitation of wheat genetic resources. Trends Plant Sci. 19, 631–636 (2014).
  37. González, M. Y. et al. Unlocking historical phenotypic data from an ex situ collection to enhance the informed utilization of genetic resources of barley (Hordeum sp.). Theor. Appl. Genet. 131, 2009–2019 (2018).
  38. Turner, A., Beales, J., Faure, S., Dunford, R. P. & Laurie, D. A. The pseudo-response regulator Ppd-H1 provides adaptation to photoperiod in barley. Science 310, 1031–1034 (2005).
  39. Digel, B. et al. Photoperiod1 (Ppd-H1) controls leaf size. Plant Physiol. 172, 405–415 (2016).
  40. Fu, D. et al. Large deletions within the first intron in VRN-1 are associated with spring growth habit in barley and wheat. Mol. Genet. Genom. 273, 54–65 (2005).
  41. Yan, L. et al. The wheat and barley vernalization gene VRN3 is an orthologue of FT. Proc. Natl Acad. Sci. USA 103, 19581–19586 (2006).
  42. Tondelli, A. et al. Allelic variation at Fr-H1/Vrn-H1 and Fr-H2 loci is the main determinant of frost tolerance in spring barley. Environ. Exp. Bot. 106, 148–155 (2014).
  43. Richards, J. K., Friesen, T. L. & Brueggeman, R. S. Association mapping utilizing diverse barley lines reveals net form net blotch seedling resistance/susceptibility loci. Theor. Appl. Genet. 130, 915–927 (2017).
  44. Arora, S. et al. Resistance gene discovery and cloning by sequence capture and association genetics. Preprint at bioRxiv https://doi.org/10.1101/248146 (2018).
  45. Jørgensen, I. H. Discovery, characterization and exploitation of Mlo powdery mildew resistance in barley. Euphytica 63, 141–152 (1992).
  46. Stein, N. et al. The eukaryotic translation initiation factor 4E confers multiallelic recessive Bymovirus resistance in Hordeum vulgare (L.). Plant J. 42, 912–922 (2005).
  47. Stracke, S. et al. Effects of introgression and recombination on haplotype structure and linkage disequilibrium surrounding a locus encoding Bymovirus resistance in barley. Genetics 175, 805–817 (2007).
  48. Friedt, W. & Foroughi-Wehr, B. Genetics of resistance to barley yellow mosaic virus. in Barley Genetics V (eds Yasuda, S. & Konishi, T.) 659–664 (Sanjo Press, Okayama, Japan, 1987).
  49. Humbroich, K. et al. Mapping of resistance against Barley mild mosaic virus-Teik (BaMMV)—an rym5 resistance breaking strain of BaMMV—in the Taiwanese barley (Hordeum vulgare) cultivar ‘Taihoku A’. Plant Breeding 129, 346–348 (2010).
  50. Romay, M. C. et al. Comprehensive genotyping of the USA national maize inbred seed bank. Genome Biol. 14, R55 (2013).
  51. Wang, W. et al. Genomic variation in 3,010 diverse accessions of Asian cultivated rice. Nature 557, 43–49 (2018).
  52. Yano, K. et al. Genome-wide association study using whole-genome sequencing rapidly identifies new genes influencing agronomic traits in rice. Nat. Genet. 48, 927–934 (2016).
  53. Jiang, Y., Schmidt, R. H., Zhao, Y. & Reif, J. C. A quantitative genetic framework highlights the role of epistatic effects for grain-yield heterosis in bread wheat. Nat. Genet. 49, 1741–1746 (2017).
  54. Navarro, J. A. R. et al. A study of allelic diversity underlying flowering-time adaptation in maize landraces. Nat. Genet. 49, 476 (2017).
  55. Zimmermann, G., Bäumlein, H., Mock, H.-P., Himmelbach, A. & Schweizer, P. The multigene family encoding germin-like proteins of barley. Regulation and function in basal host resistance. Plant Physiol. 142, 181–192 (2006).
  56. Wendler, N. et al. Unlocking the secondary gene-pool of barley with next-generation sequencing. Plant Biotechnol. J. 12, 1122–1131 (2014).
  57. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17, 10–12 (2011).
  58. Mascher, M. et al. A chromosome conformation capture ordered sequence of the barley genome. Nature 544, 427–433 (2017).
  59. Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at https://arxiv.org/abs/1303.3997 (2013).
  60. Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
  61. Li, H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27, 2987–2993 (2011).
  62. Tange, O. GNU Parallel—the command-line power tool. ;login: 36, 42–47 (2011).
  63. Zheng, X. et al. SeqArray—a storage-efficient high-performance data format for WGS variant calls. Bioinformatics 33, 2251–2257 (2017).
  64. Arend, D. et al. e!DAL—a framework to store, share and publish research data. BMC Bioinformatics 15, 214 (2014).
  65. Arend, D. et al. PGP repository: a plant phenomics and genomics data publication infrastructure. Database 2016, baw033 (2016).
  66. R Development Core Team. R: A Language and Environment for Statistical Computing. (R Foundation for Statistical Computing, Vienna, 2017).
  67. Galinsky, K. J. et al. Fast principal-component analysis reveals convergent evolution of ADH1B in Europe and East Asia. Am. J. Hum. Genet. 98, 456–472 (2016).
  68. Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
  69. Bhatia, G., Patterson, N., Sankararaman, S. & Price, A. L. Estimating and interpreting FST: the impact of rare variants. Genome Res. 23, 1514–1521 (2013).
  70. Csardi, G. & Nepusz, T. The igraph software package for complex network research. InterJournal 1695, 1–9 (2006).
  71. De Beukelaer, H., Davenport, G. F. & Fack, V. Core Hunter 3: flexible core subset selection. BMC Bioinformatics 19, 203 (2018).
  72. Dray, S. & Dufour, A.-B. The ade4 package: implementing the duality diagram for ecologists. J. Stat. Softw. 22, 1–20 (2007).
  73. Mansfeld, R. Das morphologische System der Saatgerste, Hordeum vulgare L. sl. Der Züchter 20, 8–24 (1950).
  74. Clark, M. F. & Adams, A. Characteristics of the microplate method of enzyme-linked immunosorbent assay for the detection of plant viruses. J. Gen. Virol. 34, 475–483 (1977).
  75. Lipka, A. E. et al. GAPIT: genome association and prediction integrated tool. Bioinformatics 28, 2397–2399 (2012).
  76. Bates, D., Mächler, M., Bolker, B. & Walker, S. Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, https://doi.org/10.18637/jss.v067.i01 (2015).
  77. Yu, J. et al. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat. Genet. 38, 203–208 (2006).
  78. Stich, B. et al. Comparison of mixed-model approaches for association mapping. Genetics 178, 1745–1754 (2008).
  79. VanRaden, P. Genomic measures of relationship and inbreeding. Interbull Bull. 37, 33–36 (2007).
  80. Smith, A., Cullis, B. & Gilmour, A. Applications: the analysis of crop variety evaluation data in Australia. Aust. N. Z. J. Stat. 43, 129–145 (2001).
  81. Dunn, O. J. Multiple comparisons among means. J. Am. Stat. Assoc. 56, 52–64 (1961).
  82. Mascher, M. et al. Barley whole exome capture: a tool for genomic research in the genus Hordeum and beyond. Plant J. 76, 494–505 (2013).
  83. Mascher, M. et al. Mapping-by-sequencing accelerates forward genetics in barley. Genome Biol. 15, R78 (2014).

Acknowledgements

We thank G. Matzig, J. Pohl, M. Ziems, C. Fricke, M. Kretschmann, S. König, I. Walde, G. Schütze, A. Fiebig, J. Bauernfeind, T. Münch and D. Grau for technical assistance and G. Proeseler for initiating the long-term virus testing. We are grateful to H. de Beukelaer for Corehunter support. We thank B. Schierscher-Viret from the Swiss national genebank for providing seeds and K. Lipfert for artwork. This work was supported by a grant from the Leibniz Association to N.S., U.S., H.K., A.B., A.G. and J.C.R. (Pakt für Forschung und Innovation: SAW-2015-IPK-1 ‘BRIDGE’); by the German Ministry of Education and Research (BMBF; grant 031A536 ‘de.NBI’ to U.S.); by the Young Elite Scientists Sponsorship Program (2015QNRC001) from the China Association for Science and Technology (CAST); by a grant from the China Scholarship Council to G.G.; by funding from the China Agriculture Research System (CARS-05) and the Agricultural Science and Technology Innovation Program to J.Z.; and by the Swiss Federal Office for Agriculture in the framework of the National Plan of Action for the conservation and sustainable utilization of plant genetic resources (NAP-PGREL). S.G.M. acknowledges support from the German Academic Exchange service (DAAD) through a Leibniz-DAAD fellowship. Y.J. and M.Y.G. were supported by BMBF grants 031B0184A and 031B0190A, respectively. S.F. was supported by BMBF grants to F.O. and A.Habekuß (ViReCrop, FKZ: 0315708B; COBRA, FKZ: 031A323B).

Author information

Author notes

    • Matthias Jost

    Present address: Agriculture and Food, The Commonwealth Scientific and Industrial Research Organisation, Canberra, Australia

    • Rajiv Sharma

    Present address: University of Dundee at the James Hutton Institute, Invergowrie, UK

    • Raj K. Pasam

    Present address: Department of Economic Development, Jobs, Transport and Resources, Centre for AgriBioscience, Agriculture Victoria Research, Bundoora, Victoria, Australia

  1. These authors contributed equally: Sara G. Milner, Matthias Jost.

Affiliations

  1. Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany

    Sara G. Milner, Matthias Jost, Elena Rey Mazón, Axel Himmelbach, Markus Oppermann, Stephan Weise, Helmut Knüpffer, Martín Basterrechea, Patrick König, Danuta Schüler, Rajiv Sharma, Raj K. Pasam, Twan Rutten, Yong Jiang, Maria Y. González, Yusheng Zhao, Matthias Lange, Andreas Börner, Andreas Graner, Jochen C. Reif, Uwe Scholz, Martin Mascher & Nils Stein

  2. Institute of Plant Science and Resources, Okayama University, Kurashiki, Japan

    Shin Taketa

  3. Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China

    Ganggang Guo, Dongdong Xu & Jing Zhang

  4. Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland

    Gerhard Herren, Thomas Müller, Simon G. Krattinger & Beat Keller

  5. Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia

    Simon G. Krattinger

  6. Institute for Resistance Research and Stress Tolerance, Julius Kühn Institute (Federal Research Centre for Cultivated Plants), Quedlinburg, Germany

    Antje Habekuß, Sandra Färber & Frank Ordon

  7. German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany

    Martin Mascher

  8. Center for Integrated Breeding Research, Georg-August-Universität Göttingen, Göttingen, Germany

    Nils Stein

Contributions

N.S., M.M., U.S., J.C.R. and A.G. designed research. S.G.M., M.J., M.Y.G., Y.J., Y.Z. and M.M. analyzed data. A.B. supervised germplasm retrieval. G.H. optimized DNA extraction methods. A.Himmelbach performed GBS experiments. S.W., M.O. and H.K. managed, digitized and validated passport and phenotypic data. D.S., M.L. and U.S. managed sequence and phenotypic data. G.H., T.M., S.G.K. and B.K. contributed Swiss genebank accessions. G.G., D.X. and J.Z. contributed Chinese genebank accessions. M.J. and E.R.M. led phenotyping efforts. T.R. carried out microscopy. R.S. and R.K.P. mapped awn roughness. S.T. contributed expert knowledge on awn roughness. M.B., P.K., M.L. and U.S. implemented the online portal. A.Habekuß, S.F. and F.O. contributed data on virus resistance. S.G.M., M.J., N.S. and M.M. wrote the paper. All authors have read and approved the manuscript.

Competing interests

The authors declare no competing interests.

Corresponding authors

Correspondence to Martin Mascher or Nils Stein.

Integrated supplementary information

  1. Supplementary Figure 1 Read depth and SNP distribution.

    a, Cumulative size of regions covered by GBS reads as a function of thresholds on missing data. The coverage analysis was performed on a random subset of 500 samples. Colors refer to the minimum read depth per sample. A total of 3 Mb of sequence is covered by at least two reads (dotted lines). b,c, Number of SNPs in 10-Mb windows along the barley reference genome: all SNPs (b) and SNPs with a minor allele frequency (MAF) ≥5% (c); SNPs with MAF ≥1%. Only variants with less than 10% missing data (Table 1) were considered.

  2. Supplementary Figure 2 Principal-component analysis of 19,778 domesticated barleys.

    PCA with 76,102 markers. Samples are colored according to geographic origin, row type, annual growth habit and domestication status. Color codes are defined in the inset map of a and the legends of b–d. The proportion of variance explained by the principal components is indicated in the axis labels of a. The map was created with the R package mapdata.

  3. Supplementary Figure 3 PCA with 15,872 markers with a minor allele frequency ≥ 5%.

    Samples are colored according to geographic origin. The color code is defined in the inset map. The proportion of variance explained by the principal components is indicated in the axis labels. The map was created with the R package mapdata.

  4. Supplementary Figure 4 ADMIXTURE results.

    a, ADMIXTURE ancestry coefficients for k ranging from 2 to 12 for 17,640 samples with known countries of origin. The colored blocks below the bar plots correspond to the regional grouping of Fig. 1b and Supplementary Figs. 2 and 3. b, Correspondence between ADMIXTURE and PCA. The same data points as in Supplementary Fig. 2 are shown. Samples are colored according to their assignment to ADMIXTURE groups (k = 12) in a. Samples whose highest ancestry coefficient is less than 70% are colored gray.

  5. Supplementary Figure 5 Comparison of ADMIXTURE runs with different k values.

    a, Cross-validation (CV) errors of ADMIXTURE runs. CV errors for six replicate runs per k value are plotted. Jitter was added to x-axis coordinates. b, Proportion of samples assigned to populations (q ≥ 0.7) for different values of k.

  6. Supplementary Figure 6 Decay of linkage disequilibrium in geographically defined germplasm groups of domesticated barley.

    Only variants with a minor allele frequency ≥ 1% were considered. Rolling means were used for smoothing. The geographically defined sample groups are a subset of the 10,183 non-redundant domesticated accessions of the IPK genebank.

  7. Supplementary Figure 7 Threshold for the identification of potential duplicates.

    The distribution (kernel density estimate) of pairwise homozygous differences between samples (sample set: 17,613 domesticated samples of the IPK genebank) is shown. A set of 72,200 SNPs segregating in this sample set was used. The red line (0.05%) marks the threshold for calling sample pairs potential duplicates.

  8. Supplementary Figure 8 Analysis of multiple individuals for 32 domesticated barley accessions from the IPK genebank.

    a, Number of homozygous differences for pairs of individuals from the same accession. Country codes are used according to ISO 3166-1 alpha-3. Each dot represents a pair of individuals. The red line at 1,981 marks the 95th percentile of inter-accession differences, i.e., 95% of comparisons between samples from different accessions have at least 1,981 homozygous differences. A set of 27,296 SNPs with a minor allele frequency of 0.05% was used for the identity-by-state analysis. b, Percentage of heterozygous genotype calls per individual. Each dot represents the GBS sample of one individual. Only variants with minor allele frequency ≥ 20% were considered for b. Note that heterozygosity in the majority of samples is very low, as expected for any inbreeding crop.

  9. Supplementary Figure 9 Principal-component analysis of a core set of 1,000 diverse accessions from IPK’s genebank.

    The same data points as in Supplementary Fig. 2 are shown. Samples are colored blue if they are part of the core set. All other samples are shown in gray.

  10. Supplementary Figure 10 GWAS results without imputation and genetic differentiation between hulled and naked types.

    a–c, GWAS results for morphological traits using GBS SNPs with less than 10% missing data without imputation: row type (a), awn roughness (b) and lemma adherence (c). d, FST between naked and hulled types in 1-Mb bins. The red lines in a and c indicate the significance threshold after correction for multiple testing using the Bonferroni method. GWAS scans were done using a mixed linear model approach with a sample set of 1,000 biologically independent individuals. FST was calculated using the method of Bhatia et al. (ref. 69).
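
As an illustration of the FST calculation named in the caption of Supplementary Figure 10, the following is a minimal Python sketch of the Hudson estimator as described by Bhatia et al. (ref. 69), combined across SNPs as a ratio of averages and summarized in fixed-size bins. It is not the authors' code. The function names `hudson_fst` and `fst_in_bins`, the input layout (per-SNP allele frequencies and counts of sampled allele copies for two groups) and the default bin size are assumptions made for this example; how allele copies should be counted for largely homozygous inbred lines is a choice the caption does not specify.

```python
from collections import defaultdict


def hudson_fst(p1, n1, p2, n2):
    """Hudson FST across SNPs, computed as a ratio of averages.

    p1, p2: per-SNP frequencies of the same allele in groups 1 and 2.
    n1, n2: per-SNP numbers of sampled allele copies in each group.
    (Input layout is an assumption of this sketch.)
    """
    num_total = 0.0
    den_total = 0.0
    for a, na, b, nb in zip(p1, n1, p2, n2):
        if na < 2 or nb < 2:
            continue  # the sample-size correction needs at least 2 copies
        # Numerator: squared frequency difference minus sampling corrections.
        num_total += (a - b) ** 2 - a * (1 - a) / (na - 1) - b * (1 - b) / (nb - 1)
        # Denominator: between-group heterozygosity.
        den_total += a * (1 - b) + b * (1 - a)
    if den_total == 0:
        return float("nan")  # no informative SNPs
    return num_total / den_total


def fst_in_bins(positions, p1, n1, p2, n2, bin_size=1_000_000):
    """Group SNPs on one chromosome into fixed-size bins and compute FST per bin.

    Returns a dict mapping each bin's start coordinate to its FST value.
    """
    bins = defaultdict(list)
    for i, pos in enumerate(positions):
        bins[pos // bin_size].append(i)
    result = {}
    for b, idx in sorted(bins.items()):
        result[b * bin_size] = hudson_fst(
            [p1[i] for i in idx], [n1[i] for i in idx],
            [p2[i] for i in idx], [n2[i] for i in idx],
        )
    return result
```

Summing numerators and denominators before dividing, rather than averaging per-SNP ratios, keeps rare variants from dominating the estimate within a bin. Calling `fst_in_bins` once per chromosome with the naked and hulled groups as the two populations would give one value per 1-Mb window, under the assumptions stated above.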

Supplementary information

  1. Supplementary Text and Figures

    Supplementary Figures 1–10 and Supplementary Tables 2, 3, 5 and 7

  2. Reporting Summary

  3. Supplementary Table 1

    Passport data and sequence accession codes for 22,626 GBS samples

  4. Supplementary Table 4

    Passport data and scores for morphological characters for the core set of 1,000 accessions

  5. Supplementary Table 6

    Phenotypic data for virus resistance

About this article

DOI

https://doi.org/10.1038/s41588-018-0266-x