Maize HapMap2 identifies extant variation from a genome in flux



Whereas breeders have exploited diversity in maize for yield improvements, there has been limited progress in using beneficial alleles in undomesticated varieties. Characterizing standing variation in this complex genome has been challenging, with only a small fraction of it described to date. Using a population genetics scoring model, we identified 55 million SNPs in 103 lines across pre-domestication and domesticated Zea mays varieties, including a representative from the sister genus Tripsacum. We find that structural variations are pervasive in the Z. mays genome and are enriched at loci associated with important traits. By investigating the drivers of genome size variation, we find that the larger Tripsacum genome can be explained by transposable element abundance rather than an allopolyploid origin. In contrast, intraspecies genome size variation seems to be controlled by chromosomal knob content. There is tremendous overlap in key gene content in maize and Tripsacum, suggesting that adaptations from Tripsacum (for example, perennialism and frost and drought tolerance) can likely be integrated into maize.


Figure 1: Deriving a high-quality variation map from a fluid genome.
Figure 2: The impact of SNPs and RDVs on phenotype.
Figure 3: Correlation between knobs, transposable elements and genome size in maize.

Accession codes

Primary accessions

Sequence Read Archive

Referenced accessions

NCBI Reference Sequence




Acknowledgements

This work was supported by the US National Science Foundation (DBI-0820619, 0321467, 0703908, 0638566 and IOS-092270), the USDA-ARS, the USDA–National Institute of Food and Agriculture (NIFA) (2009-01864), the US DOE (BER DE-FC02-07ER44494 and DE-AC02-03CH11211), The Rockefeller Foundation, the Bill and Melinda Gates Foundation, the Generation Challenge Program, the Chinese 971 program (2007CB813701, 2007CB813701 and 2007CB813703), the National Natural Science Foundation of China (NSFC) to Young Scientists (10723008), Guangdong Innovation Team Funding, the Chinese Ministry of Agriculture 984 program (2010-Z11), the National High Technology Research and Development Program of China (2009AA10AA03-2) and the National Basic Research Program of China (2007CB108900).

Author information

The manuscript was prepared by J.-M.C., B.G., E.S.B., M.D.M., J.R.-I. and D.W. Data analyses (including read mapping, variant detection, scoring and functional analyses) were performed by J.-M.C., C.S., J.C.G., M.G., M.B.H., T.P., Q.S., M.I.T., X.X., J.R.-I. and E.S.B. Transposon mapping and genome size analyses were performed by J.-M.C., M.G., D.C., M.I.T., J.R.-I. and B.G. Tripsacum analyses were provided by Q.S., D.C., J.C.G. and E.S.B. GWAS analyses were performed by P.J.B., M.L., F.T. and Z.Z. N.d.L., R.N., J.P., R.S.S. and S.M.K. provided early access data. J.D., R.J.E., L.G., J.C.G., K.E.G., J.H., J.L., X.L., Y.L., R.M., B.M.P., T.R., J.W., S.M.K., X.X., M.D.M., G.Z. and Y.X. provided germplasm management, developed DNA libraries and/or performed sequencing experiments. J.H., J.L., J.W., M.D.M., X.X., E.S.B. and D.W. provided experimental design and coordination.

Correspondence to Michael D McMullen, Edward S Buckler, Gengyun Zhang, Yunbi Xu or Doreen Ware.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Note, Supplementary Figures 1–11 and Supplementary Tables 1, 2, 4–8, 10, 12–17, 19–25 and 27 (PDF 3583 kb)

Supplementary Table 3

Identity by descent blocks (FILE: SuppTable3.xlsx) (XLSX 90 kb)

Supplementary Table 9

GWAS results for the 5 tested traits. (FILE: SuppTable9.xlsx) (XLSX 396 kb)

Supplementary Table 11

TE family abundance in each inbred line (FILE: SuppTable11.xlsx) (XLSX 1310 kb)

Supplementary Table 18

Differences in TE abundance between Tripsacum and maize (FILE: SuppTable18.xlsx) (XLSX 252 kb)

Supplementary Table 26

Orthologs of the most RDV variant maize genes (FILE: SuppTable26.xlsx) (XLSX 55 kb)
