Abstract
Upland cotton is the most important natural-fiber crop. The genomic variation of diverse germplasms and alleles underpinning fiber quality and yield should be extensively explored. Here, we resequenced a core collection comprising 419 accessions with 6.55-fold coverage depth and identified approximately 3.66 million SNPs for evaluating the genomic variation. We performed phenotyping across 12 environments and conducted a genome-wide association study (GWAS) of 13 fiber-related traits. In total, 7,383 unique SNPs were significantly associated with these traits and were located within or near 4,820 genes; more associated loci were detected for fiber quality than for fiber yield, and more fiber genes were detected in the D subgenome than in the A subgenome. Several previously undescribed causal genes for days to flowering, fiber length, and fiber strength were identified. Phenotypic selection for these traits increased the frequency of elite alleles during domestication and breeding. These results provide targets for molecular selection and genetic manipulation in cotton improvement.
Acknowledgements
We thank the National Mid-term Gene Bank for Cotton at the Cotton Research Institute, Chinese Academy of Agricultural Sciences, for providing the original collection seeds. We thank T. Zhang for releasing resequencing data for wild cotton accessions. This work was supported by the Fund of the China Agriculture Research System (CARS18-08) and the Science and Technology Support Program of Hebei Province (16226307D) to Z.M.; the National Major Science and Technology Program (2016ZX08005003-005) to X.W.; the National Key Research and Development Program (2016YFD0100203) to X.D., (2016YFD0101405) to Y.Z., and (2016YFD0100306) to S.H.; and the National Science and Technology Support Program (2013BAD01B03) to X.D.
Author information
Contributions
Z.M., X.W., X.D., and S.T. designed the analyses. Z.M., X.W., X.D., S.H., Y.Z., Zhihao Liu, and R.L. performed sequencing, genomic-variant, and GWAS analyses. X.W., G.Z., L. Wu, J.P., and S.T. managed the project. J.S., L. Wu, Z. Li, G.Z., J.Y., Y.J., Q.G., Z. Pan, X.L., Z.S., P.D., Zhengwen Liu, W.G., J. Wu, M.W., H. Liu, K.F., H.K., J. Wang, H. Lan, G.W., L. Wang, B.P., and Z. Peng performed field experiments and phenotyping. X.W., G.S., Y.J., Z.S., Zhengwen Liu, and N.W. performed data integration. Y.Z., Zhengwen Liu, and Z.S. performed transcriptome analyses. J.S., L. Wang, Y.J., and H.K. prepared the population material. Y.Z., Y.Y., and X.W. conducted gene expression analysis and functional validation. X.W. and Z.M. designed the research and wrote the manuscript. S.H., Y.Z., S.T., and X.D. designed the research and revised the manuscript. Z.M. and X.D. conceived the research.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Tables and Figures
Supplementary Figures 1–23 and Supplementary Tables 3, 6, 8, 9, 11–13 and 15
Supplementary Table 1
List of the 419 cotton accessions used in this study and their sequencing information
Supplementary Table 2
Statistics of different SNP mutation types for 419 accessions
Supplementary Table 4
Tracy-Widom statistics of eigenvalues from PCA analysis of 419 accessions
Supplementary Table 5
The ancestry proportion estimates for each accession when the ancestral population was specified as three
Supplementary Table 7
Number of SNP variations in different genes between the core collection and wild races
Supplementary Table 10
List of the associated SNPs and genes for 13 traits
Supplementary Table 14
SNPs, elite alleles, and their frequencies for 13 traits in wild races, early varieties, and modern varieties
About this article
Cite this article
Ma, Z., He, S., Wang, X. et al. Resequencing a core collection of upland cotton identifies genomic variation and loci influencing fiber quality and yield. Nat Genet 50, 803–813 (2018). https://doi.org/10.1038/s41588-018-0119-7
DOI: https://doi.org/10.1038/s41588-018-0119-7