Resequencing a core collection of upland cotton identifies genomic variation and loci influencing fiber quality and yield

Abstract

Upland cotton is the most important natural-fiber crop. The genomic variation of diverse germplasms and the alleles underpinning fiber quality and yield should be extensively explored. Here, we resequenced a core collection comprising 419 accessions with 6.55-fold coverage depth and identified approximately 3.66 million SNPs for evaluating the genomic variation. We performed phenotyping across 12 environments and conducted a genome-wide association study (GWAS) of 13 fiber-related traits. In total, 7,383 unique SNPs were significantly associated with these traits and were located within or near 4,820 genes; more associated loci were detected for fiber quality than for fiber yield, and more fiber genes were detected in the D subgenome than in the A subgenome. Several previously undescribed causal genes for days to flowering, fiber length, and fiber strength were identified. Phenotypic selection for these traits increased the frequency of elite alleles during domestication and breeding. These results provide targets for molecular selection and genetic manipulation in cotton improvement.

Fig. 1: Phylogenetic tree, PCA, genetic structure and LD decay of the 419 accessions.
Fig. 2: Identification of the FD causal gene GhCIP1 on chromosome Dt03.
Fig. 3: Identification of the FD causal gene GhUCE on chromosome Dt03.
Fig. 4: Identification of the FL causal gene GhFL1 on chromosome At10.
Fig. 5: Identification of the FL causal gene GhFL2 on chromosome Dt11.
Fig. 6: Identification of the causal FS gene for the peak on chromosome At07.
Fig. 7

Acknowledgements

We thank the National Mid-term Gene Bank for Cotton at the Cotton Research Institute, Chinese Academy of Agricultural Sciences, for providing the original collection seeds. We thank T. Zhang for releasing resequencing data for wild cotton accessions. This work was supported by the Fund of the China Agriculture Research System (CARS18-08) and the Science and Technology Support Program of Hebei Province (16226307D) to Z.M.; the National Major Science and Technology Program (2016ZX08005003-005) to X.W.; the National Key Research and Development Program (2016YFD0100203) to X.D., (2016YFD0101405) to Y.Z., and (2016YFD0100306) to S.H.; and the National Science and Technology Support Program (2013BAD01B03) to X.D.

Author information

Contributions

Z.M., X.W., X.D., and S.T. designed the analyses. Z.M., X.W., X.D., S.H., Y.Z., Zhihao Liu, and R.L. performed sequencing, genomic-variant, and GWAS analyses. X.W., G.Z., L. Wu, J.P., and S.T. managed the project. J.S., L. Wu, Z. Li, G.Z., J.Y., Y.J., Q.G., Z. Pan, X.L., Z.S., P.D., Zhengwen Liu, W.G., J. Wu, M.W., H. Liu, K.F., H.K., J. Wang, H. Lan, G.W., L. Wang, B.P., and Z. Peng performed field experiments and phenotyping. X.W., G.S., Y.J., Z.S., Zhengwen Liu, and N.W. performed data integration. Y.Z., Zhengwen Liu, and Z.S. performed transcriptome analyses. J.S., L. Wang, Y.J., and H.K. prepared the population material. Y.Z., Y.Y., and X.W. conducted gene expression analysis and functional validation. X.W. and Z.M. designed the research and wrote the manuscript. S.H., Y.Z., S.T., and X.D. designed the research and revised the manuscript. Z.M. and X.D. conceived the research.

Corresponding authors

Correspondence to Zhiying Ma, Xingfen Wang, Shilin Tian or Xiongming Du.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Tables and Figures

Supplementary Figures 1–23 and Supplementary Tables 3, 6, 8, 9, 11–13 and 15

Reporting Summary

Supplementary Table 1

The list of 419 cotton accessions used in this study and their sequencing information

Supplementary Table 2

Statistics of different SNP mutation types for 419 accessions

Supplementary Table 4

Tracy-Widom statistics of eigenvalues from PCA analysis of 419 accessions

Supplementary Table 5

The ancestry proportion estimates for each accession when the ancestral population was specified as three

Supplementary Table 7

Number of SNP variations in different genes between the core collection and wild races

Supplementary Table 10

List of the associated SNPs and genes for 13 traits

Supplementary Table 14

SNPs, elite alleles and their frequencies for 13 traits in wild races, early varieties and modern varieties

Cite this article

Ma, Z., He, S., Wang, X. et al. Resequencing a core collection of upland cotton identifies genomic variation and loci influencing fiber quality and yield. Nat Genet 50, 803–813 (2018). https://doi.org/10.1038/s41588-018-0119-7

  • DOI: https://doi.org/10.1038/s41588-018-0119-7
