Haplotype-resolved sweet potato genome traces back its hexaploidization history


Here we present the 15 pseudochromosomes of sweet potato, Ipomoea batatas, the seventh most important crop in the world and the fourth most significant in China. By using a novel haplotyping method based on genome assembly, we have produced a half haplotype-resolved genome from ~296 Gb of paired-end sequence reads amounting to roughly 67-fold coverage. By phylogenetic tree analysis of homologous chromosomes, it was possible to estimate the time of two recent whole-genome duplication events as occurring about 0.8 and 0.5 million years ago. This half haplotype-resolved hexaploid genome represents the first successful attempt to investigate the complexity of chromosome sequence composition directly in a polyploid genome, using sequencing of the polyploid organism itself rather than any of its simplified proxy relatives. Adaptation and application of our approach should provide higher resolution in future genomic structure investigations, especially for similarly complex genomes.
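The sequencing figures quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes the ~67-fold coverage is stated relative to the full hexaploid genome, and the function name `implied_genome_size_gb` is hypothetical rather than taken from the article.

```python
def implied_genome_size_gb(total_bases_gb: float, fold_coverage: float) -> float:
    """Genome size (in Gb) implied by total sequence yield and average depth."""
    if fold_coverage <= 0:
        raise ValueError("fold_coverage must be positive")
    return total_bases_gb / fold_coverage


# Values quoted in the abstract (both are approximate).
total_gb = 296.0   # ~296 Gb of paired-end reads
coverage = 67.0    # roughly 67-fold coverage

# Assumption: coverage refers to the whole hexaploid genome.
size_gb = implied_genome_size_gb(total_gb, coverage)

# A hexaploid carries six chromosome sets, so one set is about a sixth of the total.
per_set_gb = size_gb / 6

print(f"Implied genome size: {size_gb:.2f} Gb")
print(f"Implied size per chromosome set: {per_set_gb:.2f} Gb")
```

Under that assumption the yield and depth imply a genome of roughly 4.4 Gb, or about 0.74 Gb per chromosome set.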

Fig. 1: Outline of current sweet potato genome assembly.
Fig. 2: Summary of variations.
Fig. 3: Illustration of seed-finding algorithm.
Fig. 4: Identified gene clusters in present I. batatas genome.
Fig. 5: Evolutionary history of cultivated I. batatas revealed by phylogenetic analysis of homologous chromosome regions.




We thank J. Dai and Z. Nikoloski for helpful discussions during the haplotyping. We thank J. Zhu and Y. Jiang from Purdue University, J. Yu from Iowa State University and G. Gheysen from Ghent University for their invaluable comments during proofreading. J. Yang acknowledges support from the Alexander von Humboldt Foundation (Forschungsstipendium für erfahrene Wissenschaftler). M-Hossein Moeinzadeh acknowledges support from the IMPRS-CBSC doctoral programme. This project was funded by the International Science & Technology Cooperation Program of China (2015DFG32370), the National Natural Science Foundation of China (31201254, 31361140366, 31501353), the National High Technology Research and Development Program of China (2011AA100607-4, 2012AA101204-3), the Chinese Academy of Sciences (2012KIP518), the China Postdoctoral Science Foundation (2012M520945), the Shanghai Municipal Afforestation & City Appearance and Environmental Sanitation Administration (G102410, F122422, F132427, G142434, G152429) and the Science and Technology Commission of Shanghai Municipality (14DZ2260400, 14ZR1414100).

Author information

J.Y., M-H.M., H.K., A.R.F., B.T., P.Z. and M.V. planned and coordinated the project and wrote the manuscript. G.-L.L., J.-L.Z. and Z.S. supplied the newly bred cultivar, Taizhong6. W.-J.F., G.-F.D., H.-X.W. and S.-S.Z. prepared genomic DNA. H.K. conducted the primary genome assembly and repeat sequence identification. J.Y. and M-H.M. conducted haplotyping and genome evolution analysis. S.B. managed part of the sequencing work. J.H., P.X., S.H. and F.-H.H. supported and inspired part of the analysis.

Corresponding authors

Correspondence to Peng Zhang or Martin Vingron.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplementary Information

Supplementary Figures 1–9, Supplementary Note.

Supplementary Table 1

Statistics of QC-passed reads and mapped sequence data obtained from all libraries.

Supplementary Table 2

List of putative gene clusters; a yellow background indicates the eight gene clusters shown in Fig. 4.

About this article

Cite this article

Yang, J., Moeinzadeh, MH., Kuhl, H. et al. Haplotype-resolved sweet potato genome traces back its hexaploidization history. Nature Plants 3, 696–703 (2017).
