Upland cotton is the most important natural-fiber crop. The genomic variation of diverse germplasms and alleles underpinning fiber quality and yield should be extensively explored. Here, we resequenced a core collection comprising 419 accessions with 6.55-fold coverage depth and identified approximately 3.66 million SNPs for evaluating the genomic variation. We performed phenotyping across 12 environments and conducted a genome-wide association study of 13 fiber-related traits. In total, 7,383 unique SNPs were significantly associated with these traits and were located within or near 4,820 genes; more associated loci were detected for fiber quality than for fiber yield, and more fiber genes were detected in the D subgenome than in the A subgenome. Several previously undescribed causal genes for days to flowering, fiber length, and fiber strength were identified. Phenotypic selection for these traits increased the frequency of elite alleles during domestication and breeding. These results provide targets for molecular selection and genetic manipulation in cotton improvement.
We thank the National Mid-term Gene Bank for Cotton at the Cotton Research Institute, Chinese Academy of Agricultural Sciences, for providing the original collection seeds. We thank T. Zhang for releasing resequencing data for wild cotton accessions. This work was supported by the Fund of the China Agriculture Research System (CARS18-08) and the Science and Technology Support Program of Hebei Province (16226307D) to Z.M.; the National Major Science and Technology Program (2016ZX08005003-005) to X.W.; the National Key Research and Development Program (2016YFD0100203) to X.D., (2016YFD0101405) to Y.Z., and (2016YFD0100306) to S.H.; and the National Science and Technology Support Program (2013BAD01B03) to X.D.
The authors declare no competing financial interests.
Supplementary Figures 1–23 and Supplementary Tables 3, 6, 8, 9, 11–13 and 15
The list of 419 cotton accessions used in this study and their sequenced information
Statistics of different SNP mutation types for 419 accessions
Tracy-Widom statistics of eigenvalues from PCA analysis of 419 accessions
The ancestry proportion estimates for each accession when the ancestral population was specified as three
Number of SNP variation of different genes between core collection and wild races
List of the associated SNPs and genes for 13 traits
SNPs, elite alleles and their frequency of 13 traits in wild races, early- and modern-varieties
Ma, Z., He, S., Wang, X. et al. Resequencing a core collection of upland cotton identifies genomic variation and loci influencing fiber quality and yield. Nat Genet 50, 803–813 (2018). https://doi.org/10.1038/s41588-018-0119-7