Repeated polyploidization of Gossypium genomes and the evolution of spinnable cotton fibres

Journal name: Nature
Volume: 492
Pages: 423–427
DOI: 10.1038/nature11798

Polyploidy often confers emergent properties, such as the higher fibre productivity and quality of tetraploid cottons compared with diploid cottons bred for the same environments (ref. 1). Here we show that an abrupt five- to sixfold ploidy increase approximately 60 million years (Myr) ago, and allopolyploidy reuniting divergent Gossypium genomes approximately 1–2 Myr ago (ref. 2), conferred about 30–36-fold duplication of ancestral angiosperm (flowering plant) genes in elite cottons (Gossypium hirsutum and Gossypium barbadense), genetic complexity equalled only by Brassica (ref. 3) among sequenced angiosperms. Nascent fibre evolution, before allopolyploidy, is elucidated by comparison of spinnable-fibred Gossypium herbaceum A and non-spinnable Gossypium longicalyx F genomes to one another and the outgroup D genome of non-spinnable Gossypium raimondii. The sequence of a G. hirsutum AtDt (in which ‘t’ indicates tetraploid) cultivar reveals many non-reciprocal DNA exchanges between subgenomes that may have contributed to phenotypic innovation and/or other emergent properties such as ecological adaptation by polyploids. Most DNA-level novelty in G. hirsutum recombines alleles from the D-genome progenitor native to its New World habitat and the Old World A-genome progenitor in which spinnable fibre evolved. Coordinated expression changes in proximal groups of functionally distinct genes, including a nuclear mitochondrial DNA block, may account for clusters of cotton-fibre quantitative trait loci affecting diverse traits. Opportunities abound for dissecting emergent properties of other polyploids, particularly angiosperms, by comparison to diploid progenitors and outgroups.
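
As a rough check on the 30–36-fold figure, the sketch below multiplies the ploidy changes named in the abstract and in the Figure 1 legend. It is an illustration under stated assumptions, not the authors' method: the pan-eudicot paleohexaploidy is taken to contribute a factor of three, the A and D genome merger a factor of two, events are assumed to compound multiplicatively, and no duplicate copies are assumed lost. The function and constant names are hypothetical.

```python
from math import prod

# Assumed fold changes per event (labels are illustrative, not from the paper).
PALEOHEXAPLOIDY = 3   # pan-eudicot hexaploidy: three copies per ancestral gene
ALLOPOLYPLOIDY = 2    # merger of A and D genomes: two subgenomes per nucleus


def duplication_depth(fold_changes):
    """Return the expected copies of one ancestral gene after all events,
    assuming each event multiplies copy number and nothing is lost."""
    return prod(fold_changes)


# The Gossypium-lineage increase is reported as five- to sixfold,
# so evaluate both ends of that range.
low = duplication_depth([PALEOHEXAPLOIDY, 5, ALLOPOLYPLOIDY])
high = duplication_depth([PALEOHEXAPLOIDY, 6, ALLOPOLYPLOIDY])

print(f"{low}-{high}-fold")  # prints "30-36-fold"
```

Because real genomes lose many duplicates after each event, this product is an upper bound on copy number per ancestral gene rather than an observed count.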

Figures

  1. Evolution of spinnable cotton fibres.

    Paleohexaploidy in a eudicot ancestor (red, yellow and blue lines) formed a genome resembling that of grape (bottom right). Shortly after divergence from cacao (bottom left), the Gossypium lineage experienced a five- to sixfold ploidy increase. Spinnable fibre evolved in the A genome after its divergence from the F genome, and was further elaborated after the merger of A and D genomes ~1–2 Myr ago, forming the common ancestor of G. hirsutum (Upland) and G. barbadense (Egyptian, Sea Island and Pima) cottons.

  2. Syntenic relationships among grape, cacao and cotton.

    a, Macro-synteny connecting blocks of >30 genes (grey lines). Highlighted regions (pink and red) trace to a common ancestor before the pan-eudicot hexaploidy (ref. 7), with the Gossypium lineage five- to sixfold ploidy increase forming multiple derived regions. Inferred duplication depth in cotton varies (top). b, Micro-synteny of grape chromosome (Chr) 3, cacao chromosome 2 and five cotton chromosomes. Rectangles represent predicted genes, with connecting grey lines showing co-linear relationships. An example (1 grape, 1 cacao, 5 cotton) is highlighted in red.

  3. Paleo-evolution of cotton gene families.

    a, Myb subgroup 9 (ref. 12) originated from a gene on the progenitor of cacao chromosome 2 that formed two adjacent copies after Malvales–Brassicales divergence and then triplicated in cotton, with subsequent loss of one chromosome 8 and two chromosome 12 paralogues. One extant paralogue traces to pan-eudicot hexaploidy, Tc04_g009420, and reduplicated in cotton (Gorai.012G052500.1 and Gorai.011G122800.1) and Arabidopsis (ref. 8) (At3g01140 and At5g15310). The other, Tc01_g036330, has reduplicated in cotton (Gorai.004G157600.1 and Gorai.001G169700.1). Asterisk indicates increased gene expression in elite versus wild tetraploids (Supplementary Table 5.3). b, The most NBS-rich region of T. cacao, on chromosome 7, corresponds to regions of G. raimondii chromosome triplets 2/10/13 and 7/9/4. Cacao chromosome 7 NBSs form a single branch, indicating lineage-specific expansion. G. raimondii chromosome 7 and 13 NBSs form distinct branches, indicating cluster/tandem duplication (gene numbers also reflect physical proximity of genes to one another).

  4. Allelic changes between A- and D-genome diploid progenitors and the At and Dt subgenomes of G. hirsutum cultivar Acala Maxxa.

References

  1. Jiang, C., Wright, R. J., El-Zik, K. M. & Paterson, A. H. Polyploid formation created unique avenues for response to selection in Gossypium (cotton). Proc. Natl Acad. Sci. USA 95, 4419–4424 (1998)
  2. Wendel, J. F. New world tetraploid cottons contain old-world cytoplasm. Proc. Natl Acad. Sci. USA 86, 4132–4136 (1989)
  3. Wang, X. et al. The genome of the mesopolyploid crop species Brassica rapa. Nature Genet. 43, 1035–1039 (2011)
  4. Senchina, D. S. et al. Rate variation among nuclear genes and the age of polyploidy in Gossypium. Mol. Biol. Evol. 20, 633–643 (2003)
  5. Wang, K. et al. The draft genome of a diploid cotton Gossypium raimondii. Nature Genet. 44, 1098–1103 (2012)
  6. Carvalho, M. R., Herrera, F. A., Jaramillo, C. A., Wing, S. L. & Callejas, R. Paleocene Malvaceae from northern South America and their biogeographical implications. Am. J. Bot. 98, 1337–1355 (2011)
  7. Jaillon, O. et al. The French–Italian Public Consortium for Grapevine Genome Characterization. The grapevine genome sequence suggests ancestral hexaploidization in the major angiosperm phyla. Nature 449, 463–467 (2007)
  8. Bowers, J. E., Chapman, B. A., Rong, J. & Paterson, A. H. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422, 433–438 (2003)
  9. Muravenko, O. V. et al. Comparison of chromosome BrdU–Hoechst–Giemsa banding patterns of the A1 and (AD)2 genomes of cotton. Genome 41, 616–625 (1998)
  10. Jiao, Y. et al. A genome triplication associated with early diversification of the core eudicots. Genome Biol. 13, R3 (2012)
  11. Fawcett, J. A., Maere, S. & Van de Peer, Y. Plants with double genomes might have had a better chance to survive the Cretaceous–Tertiary extinction event. Proc. Natl Acad. Sci. USA 106, 5737–5742 (2009)
  12. Stracke, R., Werber, M. & Weisshaar, B. The R2R3-MYB gene family in Arabidopsis thaliana. Curr. Opin. Plant Biol. 4, 447–456 (2001)
  13. Adkisson, P. L., Niles, G. A., Walker, J. K., Bird, L. S. & Scott, H. B. Controlling cotton’s insect pests: a new system. Science 216, 19–22 (1982)
  14. Wang, X., Tang, H. & Paterson, A. H. Seventy million years of concerted evolution of a homoeologous chromosome pair, in parallel in major Poaceae lineages. Plant Cell 23, 27–37 (2011)
  15. Haigler, C. H., Betancur, L., Stiff, M. R. & Tuttle, J. R. Cotton fiber: a powerful single-cell model for cell wall and cellulose research. Front. Plant Sci. 3, 17 (2012)
  16. Doblin, M. S., Pettolino, F. & Bacic, A. Plant cell walls: the skeleton of the plant world. Funct. Plant Biol. 37, 357–381 (2010)
  17. Baulcombe, D. RNA silencing in plants. Nature 431, 356–363 (2004)
  18. Matzke, M. A. & Birchler, J. A. RNAi-mediated pathways in the nucleus. Nature Rev. Genet. 6, 24–35 (2005)
  19. Brodersen, P. & Voinnet, O. The diversity of RNA silencing pathways in plants. Trends Genet. 22, 268–280 (2006)
  20. Kim, H. J. & Triplett, B. A. Cotton fiber germin-like protein. I. Molecular cloning and gene expression. Planta 218, 516–524 (2004)
  21. Reva, B., Antipin, Y. & Sander, C. Predicting the functional impact of protein mutations: application to cancer genomics. Nucleic Acids Res. 39, e118 (2011)
  22. Rong, J. et al. Meta-analysis of polyploid cotton QTL shows unequal contributions of subgenomes to a complex network of genes and gene clusters implicated in lint fiber development. Genetics 176, 2577–2588 (2007)
  23. Wang, G. L., Dong, J. M. & Paterson, A. H. The distribution of Gossypium hirsutum chromatin in G. barbadense germ plasm: molecular analysis of introgressive plant-breeding. Theor. Appl. Genet. 91, 1153–1161 (1995)
  24. Richly, E. & Leister, D. NUMTs in sequenced eukaryotic genomes. Mol. Biol. Evol. 21, 1081–1084 (2004)
  25. Reinisch, A. J. et al. A detailed RFLP map of cotton (Gossypium hirsutum × Gossypium barbadense): chromosome organization and evolution in a disomic polyploid genome. Genetics 138, 829–847 (1994)
  26. Flagel, L. E., Wendel, J. F. & Udall, J. A. Duplicate gene evolution, homoeologous recombination, and transcriptome characterization in allopolyploid cotton. BMC Genomics 13, 302 (2012)
  27. Wendel, J. F., Schnabel, A. & Seelanan, T. An unusual ribosomal DNA sequence from Gossypium gossypioides reveals ancient, cryptic, intergenomic introgression. Mol. Phylogenet. Evol. 4, 298–313 (1995)
  28. Zhao, X. P. et al. Dispersed repetitive DNA has spread to new genomes since polyploid formation in cotton. Genome Res. 8, 479–492 (1998)
  29. Cronn, R., Small, R. L., Haselkorn, T. & Wendel, J. F. Cryptic repeated genomic recombination during speciation in Gossypium gossypioides. Evolution 57, 2475–2489 (2003)
  30. Haas, B. J. et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 31, 5654–5666 (2003)
  31. Jaffe, D. B. et al. Whole-genome sequence assembly for mammalian genomes: Arachne 2. Genome Res. 13, 91–96 (2003)
  32. Lin, L. et al. A draft physical map of a D-genome cotton species (Gossypium raimondii). BMC Genomics 11, 395 (2010)
  33. Rong, J.-K. et al. A 3347-locus genetic recombination map of sequence-tagged sites reveals features of genome organization, transmission and evolution of cotton (Gossypium). Genetics 166, 389–417 (2004)
  34. Rong, J. et al. Comparative genomics of Gossypium and Arabidopsis: unraveling the consequences of both ancient and recent polyploidy. Genome Res. 15, 1198–1210 (2005)
  35. Kent, W. J. BLAT–the BLAST-like alignment tool. Genome Res. 12, 656–664 (2002)
  36. Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990)
  37. Hendrix, B. & Stewart, J. M. Estimation of the nuclear DNA content of Gossypium species. Ann. Bot. 95, 789–797 (2005)
  38. Kadir, Z. B. A. DNA evolution in the genus Gossypium. Chromosoma 56, 85–94 (1976)
  39. Geever, R., Katterman, F. & Endrizzi, J. DNA hybridization analyses of Gossypium allotetraploid and two closely related diploid species. Theor. Appl. Genet. 77, 553–559 (1989)
  40. Walbot, V. & Dure, L. S. Developmental biochemistry of cotton seed embryogenesis and germination. VII. Characterization of cotton genome. J. Mol. Biol. 101, 503–536 (1976)
  41. Salamov, A. A. & Solovyev, V. V. Ab initio gene finding in Drosophila genomic DNA. Genome Res. 10, 516–522 (2000)
  42. Yeh, R.-F., Lim, L. P. & Burge, C. Computational inference of homologous gene structures in the human genome. Genome Res. 11, 803–816 (2001)

Author information

Affiliations

  1. Plant Genome Mapping Laboratory, University of Georgia, Athens, Georgia 30602, USA

    • Andrew H. Paterson,
    • Hui Guo,
    • Tae-ho Lee,
    • Jingping Li,
    • Lifeng Lin,
    • Barry S. Marler,
    • Xu Tan,
    • Haibao Tang,
    • Zining Wang,
    • Dong Zhang,
    • John E. Bowers,
    • Sayan Das,
    • Alan R. Gingle,
    • Cornelia Lemke,
    • Shahid Mansoor,
    • Lisa N. Rainville,
    • Jun-kang Rong &
    • Xiyin Wang
  2. Department of Ecology, Evolution and Organismal Biology, Iowa State University, Ames, Iowa 50011, USA

    • Jonathan F. Wendel,
    • Mi-jeong Yoo,
    • Lei Gong,
    • Corrinne Grover,
    • Kara Grupp,
    • Guanjing Hu,
    • Emmanuel Szadkowski &
    • Chunming Xu
  3. MIPS/IBIS Institute for Bioinformatics and System Biology, German Research Center for Environmental Health (GmbH), 85764 Neuherberg, Germany

    • Heidrun Gundlach &
    • Klaus F. X. Mayer
  4. Department of Energy Joint Genome Institute, Walnut Creek, California 94595, USA

    • Jerry Jenkins,
    • Shengqiang Shu,
    • Jane Grimwood,
    • Daniel S. Rokhsar &
    • Jeremy Schmutz
  5. HudsonAlpha Institute of Biotechnology, Huntsville, Alabama 35806, USA

    • Jerry Jenkins,
    • Jane Grimwood &
    • Jeremy Schmutz
  6. Center for Genomics and Computational Biology, School of Life Sciences, and School of Sciences, Hebei United University, Tangshan, Hebei 063000, China

    • Dianchuan Jin,
    • Wei Chen,
    • Tao Liu,
    • Jinpeng Wang,
    • Lan Zhang &
    • Xiyin Wang
  7. CSIRO Plant Industry, Canberra, ACT 2601, Australia

    • Danny Llewellyn,
    • Frank Bedon,
    • Curt L. Brubaker,
    • Sally A. Walford &
    • Elizabeth S. Dennis
  8. Institute for Genomics, Biocomputing & Biotechnology, Mississippi State University, Mississippi State, Mississippi 39762, USA

    • Kurtis C. Showmaker,
    • William S. Sanders &
    • Daniel G. Peterson
  9. Plant and Wildlife Science Department, Brigham Young University, Provo, Utah 84602, USA

    • Joshua Udall,
    • Robert Byers,
    • Justin T. Page,
    • David Harker &
    • Aditi Rambani
  10. Department of Field Crops, Plant Sciences Institute, ARO, Bet-Dagan 50250, Israel

    • Adi Doron-Faigenboim &
    • Ran Hovav
  11. Jamie Whitten Delta States Research Center, USDA-ARS, Stoneville, Mississippi 38776, USA

    • Mary V. Duke,
    • Brian E. Scheffler &
    • Jodi A. Scheffler
  12. Department of Biological Sciences, University of Rhode Island, Kingston, Rhode Island 02881, USA

    • Alison W. Roberts
  13. Departamento de Genética, Universidade Federal do Rio de Janeiro (UFRJ), Rio de Janeiro, 21941-901, Brazil

    • Elisson Romanel
  14. J. Craig Venter Institute, Rockville, Maryland 20850, USA

    • Haibao Tang
  15. Key Laboratory of Molecular Epigenetics of MOE, Unit of Plant Epigenetics, Institute of Genetics & Cytology, Northeast Normal University, Renmin Street, 5268 Changchun, China

    • Chunming Xu
  16. Plant Reproductive Biology Extension Center, University of California, Davis, California 95616, USA

    • Hamid Ashrafi &
    • Allen Van Deynze
  17. Bayer CropScience, Technologiepark 38, 9052 Gent, Belgium

    • Curt L. Brubaker
  18. Coastal Plain Experiment Station, University of Georgia, Tifton, Georgia 31793, USA

    • Peng W. Chee
  19. Departments of Crop Science and Plant Biology, North Carolina State University, Raleigh, North Carolina 27695, USA

    • Candace H. Haigler
  20. Centro Nacional de Pesquisa em Algodão, EMBRAPA, Santo Antônio de Góias, GO 75375-000, Brazil

    • Lucia V. Hoffmann
  21. Cotton Incorporated, Cary, North Carolina 27513, USA

    • Donald C. Jones
  22. National Institute for Biotechnology & Genetic Engineering, Faisalabad 38000, Pakistan

    • Shahid Mansoor &
    • Mehboob ur Rahman
  23. Department of Biology, West Virginia State University, Institute, West Virginia 25112, USA

    • Umesh K. Reddy
  24. Robert H. Smith Institute of Plant Sciences and Genetics in Agriculture, The Hebrew University of Jerusalem, Rehovot 76100, Israel

    • Yehoshua Saranga
  25. Department of Soil and Crop Science, Texas A&M University, College Station, Texas 77843, USA

    • David M. Stelly
  26. Cotton Fiber Bioscience Research, USDA-ARS, New Orleans, Louisiana 70124, USA

    • Barbara A. Triplett
  27. Departamento de Microbiologia, Universidade Federal do Rio de Janeiro (UFRJ), Rio de Janeiro 21941-971, Brazil

    • Maite F. S. Vaslin
  28. Central Institute for Cotton Research, Nagpur, 440010 Maharashtra, India

    • Vijay N. Waghmare
  29. Department of Plant Sciences, Texas Tech University, Lubbock, Texas 79415, USA

    • Robert J. Wright
  30. Nucleic Acids Department, Genetic Engineering & Biotechnology Research Institute, 21934 Alexandria, Egypt

    • Essam A. Zaki
  31. Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, 210095 Jiangsu, China

    • Tianzhen Zhang

Contributions

A.H.P., D.S.R., J.S. and D.G.P. conceived the study. J.S., J.G., D.S.R., K.C.S., S.D., M.V.D., C.L., L.N.R., B.E.S. and J.A.S. performed sequencing and associated clone manipulations. A.H.P., J.F.W., D.L., E.S.D., J.U., E.R., Z.W., H.A., L.V.H., R.H., D.M.S., A.V.-D. and T.Z. contributed unpublished data. A.H.P., J.F.W., H.Guo, H.Gundlach, J.J., D.J., D.L., S.S., J.U., M.-j.Y., R.B., W.C., A.D.-F., L.G., C.G., K.G., G.H., T.-h.L., J.L., L.L., T.L., B.S.M., J.T.P., A.W.R., E.R., E.S., X.T., H.T., C.X., J.W., Z.W., D.Z., L.Z., F.B., C.H.H., D.H., L.V.H., R.H., S.M., M.F.S.V., S.A.W., T.Z., E.S.D., D.S.R., X.W. and J.S. analysed data. A.H.P., J.F.W., D.L., E.S.D., K.F.X.M., D.G.P. and J.S. wrote the manuscript. All authors discussed results and commented on the manuscript.

Competing financial interests

The authors declare no competing financial interests.

Accession codes

Sequences have been deposited in NCBI for G. raimondii (BioProject accession PRJNA171262), G. longicalyx (accession F1-1, SRA061660), G. herbaceum (accession A1-97, SRA061243) and G. hirsutum (cultivar Acala Maxxa, SRS375727) genomes; G. hirsutum (SRA061240) and G. barbadense (SRA061309) fibre transcriptomes; G. hirsutum (SRA061456) seed transcriptomes; and G. hirsutum microRNAs (SRA061415).

Supplementary information

PDF files

  1. Supplementary Information (17.1M)

    This file contains Supplementary Text, Supplementary Tables, Supplementary Figures and Supplementary References – see contents for details. This file was updated on 7 February 2013 to replace Supplementary Figure S3.10.

  2. Supplementary Information (371K)

    This file contains Supplementary Figure 3.8 - Phylogenetic relationships and clade designation in R2R3-MYB proteins from G. raimondii and A. thaliana (see Supplementary Information page 33 for full legend). This file was added on 7 February 2013.

  3. Supplementary Information (2.1M)

    This file contains Supplementary Figure S3.9 - Phylogenetic analysis of the R2R3-MYBs belonging to the MIXTA clade (subgroup 9) from sequenced plant genomes, including Gossypium raimondii (see Supplementary Information page 33 for full legend). This file was added on 7 February 2013.

Zip files

  1. Supplementary Data (237K)

    This zipped file contains Supplementary Tables S3.5, S4.6, S4.7a and S5.3 – see Supplementary Information pdf for details.