Asymmetric subgenome selection and cis-regulatory divergence during cotton domestication

Abstract

Comparative population genomics offers an excellent opportunity for unraveling the genetic history of crop domestication. Upland cotton (Gossypium hirsutum) has long been an important economic crop, but a genome-wide and evolutionary understanding of the effects of human selection is lacking. Here, we describe a variation map for 352 wild and domesticated cotton accessions. We scanned 93 domestication sweeps occupying 74 Mb of the A subgenome and 104 Mb of the D subgenome, and identified 19 candidate loci for fiber-quality-related traits through a genome-wide association study. We provide evidence showing asymmetric subgenome domestication for directional selection of long fibers. Global analyses of DNase I–hypersensitive sites and 3D genome architecture, linking functional variants to gene transcription, demonstrate the effects of domestication on cis-regulatory divergence. This study provides new insights into the evolution of gene organization, regulation and adaptation in a major crop, and should serve as a rich resource for genome-based cotton improvement.

Figure 1: Geographic distribution and population diversity of Upland cotton accessions.
Figure 2: Genome-wide screening of domestication sweeps and GWAS on fiber-quality-related traits.
Figure 3: Asymmetric selection signals between the A subgenome (At) and the D subgenome (Dt).
Figure 4: Characterization of cotton DNase I–hypersensitive sites (DHSs) and detection of selected DHSs during domestication.
Figure 5: Characterization of the cotton chromatin interactome.

Accession codes

Primary accessions

Sequence Read Archive

Referenced accessions

Sequence Read Archive

Acknowledgements

We thank T. Zhang (Nanjing Agricultural University) for releasing resequencing data of wild cotton accessions. This work was supported by funding from the National Natural Science Foundation of China (31230056) to X.Z. and the National Natural Science Foundation of China (31201251) to D.Y.

Author information

Contributions

X. Zhang, L.T. and M.W. conceived and designed the project. P.W., M.L., Q.Y., Z.Y., X. Zhou, M.W. and X.N. performed the experiments. M.W., P.W. and Q.Z. developed libraries and performed sequencing. M.W., C.S., J.L., L. Zhang, K.G., Y.M., Z. Li, C.H. and D.Y. analyzed the data. Z. Lin, L.T., S.J., L. Zhu, X.Y. and L.M. collected materials and managed sequencing. M.W. wrote the manuscript draft, which was revised by K.L. and X. Zhang.

Corresponding authors

Correspondence to Keith Lindsey or Xianlong Zhang.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Lengths of small insertions and deletions in different genomic regions.

All identified indels are categorized into intergenic, intron and exon regions. For each group, the percentage of insertions or deletions with lengths below 10 bp is compared with the total indels.

Supplementary Figure 2 Population structure analysis with Structure.

Analysis with Structure (supplementary ref. 1). Each color represents one subpopulation. Each vertical bar represents one cotton accession. When K = 2, Chinese cottons were separated. When K = 3, cottons from America, Brazil, and India were separated from wild cotton accessions.

Supplementary Figure 3 Genetic diversity and population divergence at the subgenomic level among three cotton groups.

(a) Genetic diversity and population divergence in the At subgenome. (b) Genetic diversity and population divergence in the Dt subgenome. For each group, nucleotide diversity (π) is shown inside the circle. Population divergence (FST) between two groups is shown on each line.
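
The two statistics named in this legend can be illustrated with a minimal sketch. It assumes phased haplotypes encoded as strings of 0/1 alleles and uses a Hudson-style estimator for FST (one minus the ratio of within-group to between-group diversity). The legend does not say which estimator or software produced the published values, so the function names, the toy data and the choice of estimator are all assumptions made for illustration.

```python
from itertools import combinations


def pairwise_diff(a, b):
    """Number of sites at which two equal-length haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def nucleotide_diversity(haplotypes):
    """pi: mean number of pairwise differences per site within one group."""
    n_sites = len(haplotypes[0])
    pairs = list(combinations(haplotypes, 2))
    total = sum(pairwise_diff(a, b) for a, b in pairs)
    return total / (len(pairs) * n_sites)


def between_group_diversity(group1, group2):
    """Mean per-site differences over all pairs drawn one from each group."""
    n_sites = len(group1[0])
    total = sum(pairwise_diff(a, b) for a in group1 for b in group2)
    return total / (len(group1) * len(group2) * n_sites)


def fst(group1, group2):
    """Hudson-style F_ST = 1 - (mean within-group pi) / (between-group pi)."""
    pi_within = (nucleotide_diversity(group1) + nucleotide_diversity(group2)) / 2
    pi_between = between_group_diversity(group1, group2)
    return 1 - pi_within / pi_between


# Toy haplotypes (hypothetical, four sites each); not data from the study.
wild = ["0000", "0011", "0101"]
cultivated = ["1111", "1111", "1110"]

pi_wild = nucleotide_diversity(wild)
pi_cultivated = nucleotide_diversity(cultivated)
fst_value = fst(wild, cultivated)
```

In this toy case the wild group is more diverse than the cultivated group, which is the direction of difference one would expect after a domestication bottleneck.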

Supplementary Figure 4 Decay of linkage disequilibrium (LD) in the At and Dt subgenomes.

(a) Decay of LD for the At subgenome in each group. (b) Decay of LD for the Dt subgenome in each group. LD was calculated in 1 Mb distances. In the Wild group, the LD extent was estimated to be 92 kb (r² = 0.16) in the At and 64 kb (r² = 0.15) in the Dt. In the ABI group, the LD extent was estimated to be 214 kb (r² = 0.21) in the At and 138 kb (r² = 0.24) in the Dt. In the Chinese group, the LD extent was estimated to be 310 kb (r² = 0.24) in the At and 270 kb (r² = 0.25) in the Dt.
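
For readers unfamiliar with the r² statistic quoted in this legend, the following is a minimal sketch of how it can be computed for a single pair of biallelic loci. It assumes phased haplotypes coded as 0/1 and uses the standard definition r² = D² / (pA(1 − pA) pB(1 − pB)). The function name and example data are hypothetical, and the legend does not describe how the authors derived the reported LD extents from the decay curves, so that step is not sketched.

```python
def r_squared(locus_a, locus_b):
    """r^2 between two biallelic loci, given 0/1 alleles per phased haplotype."""
    n = len(locus_a)
    p_a = sum(locus_a) / n  # frequency of allele 1 at locus A
    p_b = sum(locus_b) / n  # frequency of allele 1 at locus B
    # Frequency of haplotypes carrying allele 1 at both loci.
    p_ab = sum(1 for a, b in zip(locus_a, locus_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b    # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")  # undefined when either locus is monomorphic
    return d * d / denom


# Hypothetical examples: complete LD and no LD across four haplotypes.
complete_ld = r_squared([0, 0, 1, 1], [0, 0, 1, 1])
no_ld = r_squared([0, 0, 1, 1], [0, 1, 0, 1])
```

Averaging such values over marker pairs grouped by physical distance gives a decay curve of the kind plotted in the figure.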

Supplementary Figure 5 Heat map showing genes differentially expressed between wild and domesticated cotton accessions.

Expression levels of genes were compared between five cultivated (TM-1, Maxxa, CascotL-7, CRB252 and Coker315) and four wild (TX2090, TX2094, TX2095 and TX665; supplementary ref. 2) cotton accessions. A total of 30 genes that lie under domestication sweeps and are known to be important for fiber development are shown.

Supplementary Figure 6 Asymmetric subgenome selection signals in each ancestral state.

Each ancestral state was reconstructed using homoeologous gene pairs between the At and Dt. The upper track shows selection signals in the At and the lower track shows selection signals in the Dt. Some important genes with selection signals are indicated in red, and the expression levels of these genes are shown in Supplementary Table 14. The horizontal dashed line shows the cutoff of 4.8 (πwc). The ancestral state 3 is shown in Fig. 3c.
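
The cutoff in this legend can be read as a simple threshold on a per-window statistic. The sketch below assumes that the symbol printed next to the cutoff denotes the ratio of nucleotide diversity in wild accessions to that in cultivated accessions. That reading is an assumption and is not stated in the legend. The function name and the toy values are hypothetical.

```python
CUTOFF = 4.8  # threshold quoted in the legend


def flag_windows(pi_wild, pi_cultivated, cutoff=CUTOFF):
    """Return indices of windows whose wild/cultivated diversity ratio
    reaches the cutoff. Windows with zero cultivated diversity are skipped
    because the ratio is undefined there."""
    flagged = []
    for i, (w, c) in enumerate(zip(pi_wild, pi_cultivated)):
        if c > 0 and w / c >= cutoff:
            flagged.append(i)
    return flagged


# Hypothetical per-window diversity values; not data from the study.
example_wild = [0.0010, 0.0020, 0.0015]
example_cultivated = [0.0008, 0.0002, 0.0]
flagged_windows = flag_windows(example_wild, example_cultivated)
```

Only the second window (ratio 10) passes in this example. The first has a ratio of 1.25, and the third is skipped.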

Supplementary Figure 7 RNA-seq analysis of 50 cotton accessions.

(a) Clustering of the wild and cultivated cottons. Cotton accessions are represented by dots in four different colors. The wild and cultivated accessions are each grouped with grey circles. (b) Expression breadth of genes. High Jensen–Shannon (JS) scores (supplementary ref. 3) indicate that genes are highly expressed in one or a few accessions and therefore vary widely in expression level. This analysis shows that genes in wild cottons exhibit wider variation in expression than do genes in cultivated cottons. Abbreviations representing cottons from different cultivation regions in China are the same as those in Fig. 1c.
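
To make the JS score in panel b concrete, here is a minimal sketch of one commonly used formulation of a Jensen–Shannon specificity score. The expression profile of a gene is normalized to sum to 1 and compared with an idealized profile in which the gene is expressed in a single accession. The score is one minus the square root of the Jensen–Shannon divergence, maximized over accessions. Whether the authors computed the score in exactly this way is an assumption, and the function names and toy profiles are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy in bits, skipping zero entries."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def js_specificity(expression):
    """Max over accessions of 1 - sqrt(JSD(profile, single-accession profile))."""
    total = sum(expression)
    if total == 0:
        return 0.0  # gene not expressed anywhere
    p = [x / total for x in expression]
    best = 0.0
    for t in range(len(p)):
        # Idealized profile: all expression in accession t.
        q = [1.0 if i == t else 0.0 for i in range(len(p))]
        m = [(a + b) / 2 for a, b in zip(p, q)]
        jsd = entropy(m) - (entropy(p) + entropy(q)) / 2
        best = max(best, 1 - math.sqrt(max(jsd, 0.0)))
    return best


# Hypothetical profiles across four accessions.
specific_gene = js_specificity([0, 0, 10, 0])  # expressed in one accession
broad_gene = js_specificity([5, 5, 5, 5])      # evenly expressed
```

A gene expressed in a single accession scores 1, and an evenly expressed gene scores much lower, which matches the interpretation given in the legend.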

Supplementary Figure 8 DNase I digestion of cotton nuclei.

M, marker. 1, 4°C treatment without digestion. 2, 37°C treatment without digestion. 3–10, DNase I digestion at 37°C with enzyme doses of 0.5 U, 1 U, 2 U, 3 U, 4 U, 8 U, 10 U and 16 U, respectively. The digestion time was 10 min.

Supplementary Figure 9 Chromosomal landscape of DNase I–hypersensitive sites (DHSs) and chromatin-modification marks.

(a) TE content. Regions with high TE content are represented by the dark purple bands. (b) Gene density. (c) Enrichment for H3K4me1 modification. (d) Enrichment for H3K4me3 modification. (e) Enrichment for H3K27me3 modification. (f) Enrichment for H3K9me2 modification. (g) Number of DHSs in cotton leaves. (h) Number of DHSs in cotton fibers. For tracks b-h, high column bars show high enrichment of chromatin modification marks or large numbers of DHSs in chromosomal regions. For the circos plot, each chromosome was divided into 1 Mb windows sliding 200 kb.
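
The windowing scheme described for the circos plot (1-Mb windows sliding by 200 kb) can be sketched as follows. Half-open coordinates, truncation of the last window at the chromosome end, and the function names are assumptions made for illustration. The legend does not specify how chromosome ends were handled.

```python
def sliding_windows(chrom_length, window=1_000_000, step=200_000):
    """Yield half-open (start, end) windows; the last one is truncated
    at the chromosome end."""
    start = 0
    while start < chrom_length:
        end = min(start + window, chrom_length)
        yield start, end
        if end == chrom_length:
            break
        start += step


def count_per_window(positions, chrom_length, window=1_000_000, step=200_000):
    """Count features (for example DHS midpoints) falling in each window."""
    return [
        sum(1 for pos in positions if start <= pos < end)
        for start, end in sliding_windows(chrom_length, window, step)
    ]


# Hypothetical 1.5-Mb chromosome with three feature positions.
windows = list(sliding_windows(1_500_000))
counts = count_per_window([100_000, 250_000, 1_450_000], 1_500_000)
```

Because the windows overlap, a single feature can contribute to several adjacent windows, which smooths the tracks.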

Supplementary Figure 10 Patterns of chromatin-modification marks in genic and TE regions.

(a) Enrichment of chromatin modification marks in genic regions. (b) Enrichment of chromatin modification marks in short TEs (<500 bp). (c) Enrichment of chromatin modification marks in long TEs (>4 kb). The chromatin modification level was normalized by Input DNA sequencing data. For each analysis, the upstream and downstream 2 kb sequences were divided into 100 bins of 20 bp. Gene and TE bodies were divided into 100 bins of equal lengths.
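
The binning described here (2-kb flanks split into 100 bins of 20 bp, and feature bodies split into 100 bins of equal length) can be sketched as below. The function name is hypothetical. Half-open coordinates and a plus-strand feature far enough from the chromosome start are assumptions, and strand handling and normalization by input are omitted.

```python
def profile_bins(start, end, flank=2000, n_bins=100):
    """Return (upstream, body, downstream) lists of half-open (start, end) bins
    for one plus-strand feature spanning [start, end)."""
    flank_bin = flank // n_bins  # 20 bp with the defaults
    upstream = [
        (start - flank + i * flank_bin, start - flank + (i + 1) * flank_bin)
        for i in range(n_bins)
    ]
    body_len = end - start
    # Integer arithmetic keeps body bins contiguous even when the length
    # is not a multiple of n_bins.
    body = [
        (start + i * body_len // n_bins, start + (i + 1) * body_len // n_bins)
        for i in range(n_bins)
    ]
    downstream = [
        (end + i * flank_bin, end + (i + 1) * flank_bin)
        for i in range(n_bins)
    ]
    return upstream, body, downstream


# Hypothetical 5-kb gene at positions 10,000 to 15,000.
up, body, down = profile_bins(10_000, 15_000)
```

Scaling every body to the same number of bins is what allows genes or TEs of different lengths to be averaged into a single profile.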

Supplementary Figure 11 Clustering and ordering contigs of G. hirsutum with LACHESIS.

(a) The results of clustering of simulated contigs in the At subgenome of G. hirsutum. (b) The results of clustering of simulated contigs in the Dt subgenome of G. hirsutum. In this analysis, we split the TM-1 genome into 100 kb simulated contigs and mapped Hi-C clean reads to them. The LACHESIS software (supplementary ref. 4) was used to cluster and order these contigs. The derived contig groups were compared with chromosome assemblies in the reference genome of TM-1 (supplementary ref. 5). Discrete dots show putative genome assembly errors.

Supplementary Figure 12 Global chromatin interaction in the At and Dt subgenomes.

Chromatin interaction for the At subgenome is indicated in the upper right triangular matrix. Chromatin interaction for the Dt subgenome is indicated in the lower left triangular matrix. The chromatin interaction maps are visualized at a 200-kb resolution. Strong contact is represented in red and weak contact in white.

Supplementary Figure 13 Patterns of chromatin-modification marks in topologically associated domain–like (TAD-like) and boundary-like regions.

(a) Enrichment of chromatin modification marks in TAD-like regions. (b) Enrichment of chromatin modification marks in boundary-like regions. For each modification mark, the enrichment level was normalized by Input DNA sequencing data.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–13 and Supplementary Tables 1–6, 8, 11, 17, 18, 22 and 23 (PDF 3191 kb)

Supplementary Table 7

Identification of domestication sweeps and genes (XLSX 52 kb)

Supplementary Table 9

Summary of genes in QTL hotspot regions with selection signals (XLSX 41 kb)

Supplementary Table 10

Fiber quality-related traits used for GWAS (XLSX 75 kb)

Supplementary Table 12

Expression and annotation of candidate genes identified by GWAS (XLSX 18 kb)

Supplementary Table 13

Summary of homoeologous gene pairs with selection signals in at least one subgenome (XLSX 48 kb)

Supplementary Table 14

Summary of genes and expression with asymmetric subgenome selection signals (XLSX 22 kb)

Supplementary Table 15

Summary of promoter DHSs with domestication signals (XLSX 77 kb)

Supplementary Table 16

Summary of transcription factor binding motifs identified in TM-1 genome (XLSX 10608 kb)

Supplementary Table 19

Identification of topological domain-like and boundary-like regions in TM-1 genome (XLSX 99 kb)

Supplementary Table 20

Summary of promoter-centered chromatin interactions in TM-1 accession (XLSX 8214 kb)

Supplementary Table 21

Summary of enhancers under domestication selection (XLSX 98 kb)

About this article

Cite this article

Wang, M., Tu, L., Lin, M. et al. Asymmetric subgenome selection and cis-regulatory divergence during cotton domestication. Nat Genet 49, 579–587 (2017). https://doi.org/10.1038/ng.3807

  • DOI: https://doi.org/10.1038/ng.3807
