Most fruits in our daily diet are the products of domestication and breeding. Here we report a map of genome variation for a major fruit that encompasses ~3.6 million variants, generated by deep resequencing of 115 cucumber lines sampled from 3,342 accessions worldwide. Comparative analysis suggests that fruit crops underwent narrower bottlenecks during domestication than grain crops. We identified 112 putative domestication sweeps; one of these regions contains a gene involved in the loss of bitterness in fruits, an essential domestication trait of cucumber. We also investigated the genomic basis of divergence among the cultivated populations and discovered a natural genetic variant in a β-carotene hydroxylase gene that could be used to breed cucumbers with enhanced nutritional value. The genomic history of cucumber evolution uncovered here provides the basis for future genomics-enabled breeding.
- Supplementary Text and Figures (4,742 KB): Supplementary Note, Supplementary Tables 2, 3, 5, 6 and 14, and Supplementary Figures 1–14
- Supplementary Table 1 (65 KB): Summary of the sampled core collection
- Supplementary Table 4 (147 KB): Presence and absence variation (PAV) genes identified in the core collection of 115 cucumber accessions
- Supplementary Table 7 (58 KB): The SNP loci chosen for validation by PCR and Sanger sequencing
- Supplementary Table 8 (76 KB): Putative regions identified to be under domestication sweeps
- Supplementary Table 9 (553 KB): Genes within the putative regions identified to be under domestication sweeps
- Supplementary Table 10 (80 KB): Summary of the genes present within the Bt region
- Supplementary Table 11 (59 KB): Highly differentiated regions across the cultivated groups
- Supplementary Table 12 (1,126 KB): Genes located in the highly differentiated regions
- Supplementary Table 13 (453 KB): Genes containing nonsynonymous SNPs with significantly high FST values
- Supplementary Dataset (6,661 KB): Supplementary dataset for Supplementary Figures 1–5 and 7–10