  • Letter

Resequencing of 243 diploid cotton accessions based on an updated A genome identifies the genetic basis of key agronomic traits

Abstract

The ancestors of Gossypium arboreum and Gossypium herbaceum provided the A subgenome for the modern cultivated allotetraploid cotton. Here, we upgraded the G. arboreum genome assembly by integrating different technologies. We resequenced 243 G. arboreum and G. herbaceum accessions to generate a map of genome variations and found that they are equally diverged from Gossypium raimondii. Independent analysis suggested that Chinese G. arboreum originated in South China and was subsequently introduced to the Yangtze and Yellow River regions. Most accessions with domestication-related traits experienced geographic isolation. A genome-wide association study (GWAS) identified 98 significant peak associations for 11 agronomically important traits in G. arboreum. A nonsynonymous cysteine-to-arginine substitution in GaKASIII seems to confer substantial changes in the fatty acid composition (C16:0 and C16:1) of cotton seeds. Resistance to fusarium wilt disease is associated with activation of GaGSTF9 expression. Our work represents a major step toward understanding the evolution of the A genome of cotton.

Fig. 1: Genomic divergence and geographic-relationship analysis.
Fig. 2: GaKASIII regulates cotton seed oil content.
Fig. 3: A genetic locus that underwent geographical isolation confers resistance to fusarium wilt disease.
Fig. 4: Both GWAS and QTL analysis identified the same region in the G. arboreum genome as being potentially important for seed fuzz development.

Acknowledgements

This work was supported by funding from the National Natural Science Foundation of China (grants 31621005 to F. Li and 90717009 to Y.Z.), the National Key Technology R&D Program, the Ministry of Science and Technology (2016YFD0100203 to X.D. and 2016YFD0100306 to S. He), the National Science and Technology Support Program, the Ministry of Agriculture (2013BAD01B03 to X.D.), the Agricultural Science and Technology Innovation Program of the Chinese Academy of Agricultural Sciences (CAAS-ASTIP-IVFCAAS to S. Huang), and the leading talents of Guangdong Province Program (00201515 to S. Huang).

Author information

Contributions

F. Li, Y.Z., X.D., and T.L. conceived and designed the research. F. Li and S. Huang managed the project. T.L., N.L., M.L., F. Liu, F.W., H. Zheng, and G.S. performed the genome sequencing, assembly, and bioinformatics. X.D., S. He, J.S., Z.Y., X.M., X.Z., Y.J., Z. Pan, W.G., Z.L., H. Zhu, L.M., D.Y., Q.G., Z. Peng, L.W., S.X., and X.W. prepared the samples, performed phenotyping, and contributed to data analysis. Y.Z. designed the molecular experiments, and Z.Y. and G.H. performed the molecular experiments and led interpretation of the molecular-data analysis. S. He, Z.Y., and G.H. prepared the figures and tables. Y.Z., S. He, G.H., Z.Y., T.L., S. Huang, H.S., C.L., and W.F. wrote and revised the manuscript.

Corresponding authors

Correspondence to Tao Lin, Yuxian Zhu or Fuguang Li.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–15 and Supplementary Tables 1–6

Reporting Summary

Supplementary Tables

Supplementary Tables 7–18

About this article

Cite this article

Du, X., Huang, G., He, S. et al. Resequencing of 243 diploid cotton accessions based on an updated A genome identifies the genetic basis of key agronomic traits. Nat Genet 50, 796–802 (2018). https://doi.org/10.1038/s41588-018-0116-x

  • DOI: https://doi.org/10.1038/s41588-018-0116-x
