The pineapple genome and the evolution of CAM photosynthesis

Journal name: Nature Genetics
Volume: 47
Pages: 1435–1442
DOI: 10.1038/ng.3435

Abstract

Pineapple (Ananas comosus (L.) Merr.) is the most economically valuable crop possessing crassulacean acid metabolism (CAM), a photosynthetic carbon assimilation pathway with high water-use efficiency, and the second most important tropical fruit. We sequenced the genomes of pineapple varieties F153 and MD2 and a wild pineapple relative, Ananas bracteatus accession CB5. The pineapple genome has one fewer ancient whole-genome duplication event than sequenced grass genomes and a conserved karyotype with seven chromosomes from before the ρ duplication event. The pineapple lineage has transitioned from C3 photosynthesis to CAM, with CAM-related genes exhibiting a diel expression pattern in photosynthetic tissues. CAM pathway genes were enriched with cis-regulatory elements associated with the regulation of circadian clock genes, providing the first cis-regulatory link between CAM and circadian clock regulation. Pineapple CAM photosynthesis evolved by the reconfiguration of pathways in C3 plants, through the regulatory neofunctionalization of preexisting genes and not through the acquisition of neofunctionalized genes via whole-genome or tandem gene duplication.

At a glance

Figures

  1. Phylogenetic analysis of the pineapple LTR retrotransposon sequences encoding the reverse-transcriptase domain.
    Figure 1: Phylogenetic analysis of the pineapple LTR retrotransposon sequences encoding the reverse-transcriptase domain.

    The unrooted phylogenetic tree of Gypsy and Copia elements was constructed on the basis of 6,379 aligned sequences corresponding to the reverse-transcriptase domain.

  2. Karyotype evolution in the monocots.
    Figure 2: Karyotype evolution in the monocots.

    Shown are the 25 pineapple chromosomes organized into the pairs of paired chromosomes that arose after two WGD events. Each color represents one of the seven ancestral chromosomes. The left and right pairs represent the two subgenomes produced by WGD τ, and within each pair are the two subgenomes produced by WGD σ.

  3. Genome evolution in pineapple.
    Figure 3: Genome evolution in pineapple.

    (a) Dating of WGD events on the monocot tree of life. Circles represent known WGD events identified previously. The pineapple genome sequence clarified the dating of the three WGD events in the grass lineage: ρ, σ and τ. Taxon labels are colored according to photosynthetic metabolism: C3, C4 or CAM. (b) Genomic alignment for Amborella trichopoda, A. comosus (pineapple) and Oryza sativa (rice), tracking gene positions through multiple species and copy numbers arising from multiple genome duplication events. Macrosynteny patterns show that a typical ancestral region in the basal angiosperm Amborella can be tracked to up to four regions in pineapple owing to the two genome duplication events, σ and τ, and to up to eight regions in rice. Gray wedges in the background highlight major syntenic blocks spanning more than 30 genes between the genomes (highlighted by one syntenic set shown in red). (c) Microcollinearity patterns between genomic regions from A. trichopoda, A. comosus (pineapple) and O. sativa (rice). Rectangles represent predicted gene models, with blue and green showing relative gene orientation. Gray wedges connect matching gene pairs, with one set highlighted in red.

  4. Evolution of the CAM pathway in pineapple.
    Figure 4: Evolution of the CAM pathway in pineapple.

    (a) Pineapple leaf tissue used to survey the diurnal expression patterns of CAM pathway genes. The fully expanded D leaf of field-grown pineapple is shown. Green (photosynthetic) tissue at the leaf tip and white (non-photosynthetic) tissue at the leaf base were collected to distinguish CAM-related gene expression from non-CAM-related circadian oscillation. (b) Overview of the carboxylation (top) and decarboxylation (bottom) pathways of CAM. CAM enzymes are shown in blue. (c) Expression pattern and cis-regulatory elements of pineapple carbon fixation genes across the diurnal expression data. log2-transformed fragments mapped per kilobase of transcript length per million total mapped reads (FPKM) expression profiles are shown. Four known circadian clock–related binding motif sequences were searched in the 1-kb region upstream of each gene. (d) Summary table of the number of putative carbon fixation genes in pineapple, orchid, rice and maize. (e) Gene regulatory network of green leaf tissue. Only the largest module of the network was kept. Genes related to CAM and their interaction partners are highlighted in yellow.

  5. Correlation between family copy number and expression level of LTR elements.
    Supplementary Fig. 1: Correlation between family copy number and expression level of LTR elements.

    The data indicate that high expression levels of LTR elements are correlated with a relatively low copy number of their family.

  6. Expression of intact LTR retrotransposons in nine pineapple tissue samples.
    Supplementary Fig. 2: Expression of intact LTR retrotransposons in nine pineapple tissue samples.

    This heat map shows the number of RNA-seq reads mapped to the top 40 most highly expressed LTR retrotransposon families. Family names are shown as row labels, and tissue names are given as column labels. From top to bottom, the rows are sorted by total counts of mapped reads in families.

  7. Expression of subfamilies of LTR retrotransposons in nine pineapple tissue samples.
    Supplementary Fig. 3: Expression of subfamilies of LTR retrotransposons in nine pineapple tissue samples.

    The heat map shows the number of RNA-seq reads mapped to the top ten most highly expressed LTR retrotransposon families. Each row represents a subfamily, and each column represents a tissue. The numbers following family names give subfamily IDs. From top to bottom, the rows are sorted by total counts of mapped reads in families. Within each family, the rows are further sorted by total counts of mapped reads in subfamilies.

  8. Synonymous substitutions per site (Ks) values between inferred whole-genome duplicates in pineapple.
    Supplementary Fig. 4: Synonymous substitutions per site (Ks) values between inferred whole-genome duplicates in pineapple.

    (a) Syntenic dot plot in pineapple versus pineapple comparison, with Ks values color coded; only the gene pairs with a Ks value between 0 and 2 are plotted. (b) Histogram of Ks values for pineapple-rice orthologs, rice whole-genome duplicates and pineapple whole-genome duplicates.

  9. Pairwise genome comparisons between pineapple and ten related plant species.
    Supplementary Fig. 5: Pairwise genome comparisons between pineapple and ten related plant species.

    Pairwise comparisons (dot plots) between pineapple (y axis) and a total of ten related plant genomes (x axis), including (a–j) Amborella, banana, date palm, duckweed, grape, oil palm, orchid, pineapple (i.e., self-comparison), rice and sorghum. For clarity, only gene pairs within synteny blocks of at least size 4 are shown.

  10. Microsynteny fractionation for 4:1 pineapple to Amborella, providing evidence that pineapple has undergone two WGDs in its lineage since their divergence.
    Supplementary Fig. 6: Microsynteny fractionation for 4:1 pineapple to Amborella, providing evidence that pineapple has undergone two WGDs in its lineage since their divergence.

    Five exemplar regions are shown. Each panel contains multiple parallel tracks representing syntenic regions in Amborella and pineapple. Connecting lines show sequence similarities between the regions. CoGe, https://genomevolution.org/r/e426, https://genomevolution.org/r/e428, https://genomevolution.org/r/e427, https://genomevolution.org/r/e448 and https://genomevolution.org/r/e446.

  11. Microsynteny fractionation for 1:2 pineapple to rice, providing evidence that rice has undergone one WGD in its lineage (ρ) since its divergence from pineapple.
    Supplementary Fig. 7: Microsynteny fractionation for 1:2 pineapple to rice, providing evidence that rice has undergone one WGD in its lineage (ρ) since its divergence from pineapple.

    Three exemplar regions are shown. Each panel contains multiple parallel tracks representing syntenic regions in rice and pineapple. Connecting lines show sequence similarities between the regions. CoGe, https://genomevolution.org/r/e3kg, https://genomevolution.org/r/e3kw and https://genomevolution.org/r/e3k4.

  12. Dating of whole-genome duplication (WGD) events on the flowering plant tree.
    Supplementary Fig. 8: Dating of whole-genome duplication (WGD) events on the flowering plant tree.

    Letters represent previously identified WGDs. Estimated gene family phylogenies including genes on syntenic blocks corresponding to the σ and τ WGDs were queried to identify the timing of implied gene duplications relative to speciation events. The numbers below each lineage in the monocot clade represent gene duplication events corresponding to the σ (green) and τ (purple) synteny blocks. Trees with inferred duplication events supported by greater than 80% (left) and between 80% and 50% (right) bootstrap support values are shown for each node. Taxon names are color coded as in Figure 2.

  13. Property of leaf green tip gene interaction network.
    Supplementary Fig. 9: Property of leaf green tip gene interaction network.

    (a,c,d) Distributions of the node degree, diameter and betweenness attribute. (b) Relationship between node degree and frequency in logarithmic coordinates.

  14. Schematic workflow of the pineapple genome assembly and improvement.
    Supplementary Fig. 10: Schematic workflow of the pineapple genome assembly and improvement.
  15. k-mer coverage of the F153 fragment library (k = 23).
    Supplementary Fig. 11: k-mer coverage of the F153 fragment library (k = 23).
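
The Figure 4c legend mentions two computations: log2-transformed FPKM expression values, and a search for circadian clock–related motifs in the 1-kb region upstream of each gene. The following is a minimal Python sketch of both, not the authors' pipeline. The function names, the pseudocount and the two example motifs (the evening element and the G-box, both well-known plant promoter elements) are assumptions made for illustration; the legend does not list the four motifs that were actually searched.

```python
import math
import re

# Illustrative motifs only (assumption): the figure legend does not name
# the four clock-related motifs that were searched.
MOTIFS = {
    "evening_element": "AAAATATCT",
    "g_box": "CACGTG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def log2_fpkm(fragments, transcript_length_bp, total_mapped_fragments,
              pseudocount=1.0):
    """Compute log2(FPKM + pseudocount).

    FPKM = fragments / (transcript length in kb * mapped fragments in millions).
    The pseudocount is an assumption, added here to avoid log2(0).
    """
    fpkm = fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)
    return math.log2(fpkm + pseudocount)


def scan_upstream(upstream_seq, motifs=MOTIFS):
    """Map each motif name to sorted 0-based start positions on either strand.

    upstream_seq is assumed to be the (for example 1-kb) sequence upstream
    of a gene, given 5' to 3' on the forward strand.
    """
    seq = upstream_seq.upper()
    hits = {}
    for name, motif in motifs.items():
        positions = set()
        # A set of patterns avoids double counting palindromic motifs.
        for pattern in {motif, reverse_complement(motif)}:
            # A lookahead reports overlapping occurrences as well.
            for match in re.finditer(f"(?={re.escape(pattern)})", seq):
                positions.add(match.start())
        hits[name] = sorted(positions)
    return hits
```

In use, `scan_upstream` would be called once per gene on its extracted upstream sequence, and `log2_fpkm` once per gene and time point. Reverse-strand hits are reported at the forward-strand coordinate where the reverse-complemented motif begins.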

Author information

  1. These authors contributed equally to this work.

    • Ray Ming,
    • Robert VanBuren,
    • Ching Man Wai &
    • Haibao Tang

Affiliations

  1. Fujian Agriculture and Forestry University and University of Illinois at Urbana-Champaign–School of Integrative Biology Joint Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, Fuzhou, China.

    • Ray Ming,
    • Robert VanBuren,
    • Ching Man Wai,
    • Haibao Tang,
    • Jisen Zhang,
    • Lixian Huang,
    • Lingmao Zhang,
    • Wenjing Miao,
    • Jian Zhang,
    • Zhangyao Ye,
    • Chenyong Miao,
    • Zhicong Lin,
    • Zhenyang Liao,
    • Jingping Fang,
    • Juan Liu,
    • Xiaodan Zhang,
    • Qing Zhang,
    • Weichang Hu,
    • Yuan Qin,
    • Kai Wang &
    • Li-Yu Chen
  2. Fujian-Taiwan Joint Center for Ecological Control of Crop Pests, Fujian Agriculture and Forestry University, Fuzhou, China.

    • Ray Ming,
    • Robert VanBuren,
    • Ching Man Wai,
    • Haibao Tang,
    • Jisen Zhang,
    • Lixian Huang,
    • Lingmao Zhang,
    • Wenjing Miao,
    • Jian Zhang,
    • Zhangyao Ye,
    • Chenyong Miao,
    • Zhicong Lin,
    • Zhenyang Liao,
    • Jingping Fang,
    • Juan Liu,
    • Xiaodan Zhang,
    • Qing Zhang,
    • Weichang Hu,
    • Yuan Qin,
    • Kai Wang &
    • Li-Yu Chen
  3. Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.

    • Ray Ming,
    • Robert VanBuren,
    • Ching Man Wai &
    • Katy Heath
  4. Donald Danforth Plant Science Center, St. Louis, Missouri, USA.

    • Robert VanBuren,
    • Henry D Priest,
    • Michael R McKain &
    • Todd Mockler
  5. iPlant Collaborative/University of Arizona, Tucson, Arizona, USA.

    • Haibao Tang &
    • Eric Lyons
  6. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA.

    • Michael C Schatz,
    • Eric Biggers,
    • Hayan Lee,
    • James Gurtowski &
    • Fritz J Sedlazeck
  7. Department of Plant Biology, University of Georgia, Athens, Georgia, USA.

    • John E Bowers,
    • Hao Wang,
    • Hongye Zhou,
    • Alex Harkess,
    • James H Leebens-Mack &
    • Jeffrey L Bennetzen
  8. Hawaii Agriculture Research Center, Kunia, Hawaii, USA.

    • Ming-Li Wang &
    • Paul H Moore
  9. Department of Tropical Plant and Soil Sciences, University of Hawaii, Honolulu, Hawaii, USA.

    • Jung Chen &
    • Robert E Paull
  10. Department of Biochemistry and Molecular Biology, University of Nevada, Reno, Nevada, USA.

    • Won C Yim &
    • John C Cushman
  11. Department of Mathematics and Statistics, University of Ottawa, Ottawa, Ontario, Canada.

    • Chunfang Zheng &
    • David Sankoff
  12. Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, California, USA.

    • Margaret Woodhouse,
    • Patrick P Edger &
    • Michael Freeling
  13. Institut de Recherche pour le Développement, Diversité Adaptation et Développement des Plantes, Montpellier, France.

    • Romain Guyot
  14. Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, Tennessee, USA.

    • Hao-Bo Guo &
    • Hong Guo
  15. Key Laboratory of Computational Biology, Chinese Academy of Sciences–Max Planck Gesellschaft Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China.

    • Guangyong Zheng &
    • Xinguang Zhu
  16. Texas A&M AgriLife Research, Department of Plant Pathology and Microbiology, Texas A&M University System, Dallas, Texas, USA.

    • Ratnesh Singh,
    • Anupma Sharma &
    • Qingyi Yu
  17. Department of Biological Sciences, Youngstown State University, Youngstown, Ohio, USA.

    • Xiangjia Min
  18. Faculty of Life Science and Technology, Kunming University of Science and Technology, Kunming, China.

    • Yun Zheng
  19. Australian Research Council (ARC) Centre of Excellence in Plant Cell Walls, School of Agriculture, Food and Wine, University of Adelaide, Waite Campus Urrbrae, Adelaide, South Australia, Australia.

    • Neil Shirley &
    • Vincent Bulone
  20. Department of Agronomy, National Taiwan University, Taipei, Taiwan.

    • Yann-Rong Lin &
    • Li-Yu Liu
  21. W.M. Keck Center, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.

    • Alvaro G Hernandez &
    • Chris L Wright
  22. Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA.

    • Gerald A Tuskan &
    • Xiaohan Yang
  23. US Department of Agriculture–Agricultural Research Service (USDA-ARS), Pacific Basin Agricultural Research Center, Hilo, Hawaii, USA.

    • Francis Zee
  24. Department of Biochemistry and Molecular Biology, Noble Research Center, Oklahoma State University, Stillwater, Oklahoma, USA.

    • Ramanjulu Sunkar
  25. Plant Genome Mapping Laboratory, University of Georgia, Athens, Georgia, USA.

    • Andrew H Paterson
  26. Department of Plant Sciences, University of Oxford, Oxford, UK.

    • J Andrew C Smith

Contributions

R.M., Q.Y., R.E.P., P.H.M., R.V. and C.M.W. conceived the experiments. L.H., L.Z., W.M., A.G.H. and C.L.W. sequenced the genomes. M.C.S., E.B., H.L., J.G. and F.J.S. assembled the genome. H.T., C.M. and Z.Y. annotated the genome. R.M., R.V., C.M.W., J.E.B., E.L., M.-L.W., J.C., Jisen Zhang, Z. Lin, Jian Zhang, H.W., H.Z., W.C.Y., H.D.P., C.Z., M.W., P.P.E., R.G., H.-B.G., H.G., G.Z., R. Singh, A.S., X.M., Y.Z., A.H., M.R.M., Z. Liao, J.F., J.L., X. Zhang, Q.Z., W.H., Y.Q., K.W., L.-Y.C., N.S., Y.-R.L., L.-Y.L., V.B., G.A.T., K.H., F.Z., R. Sunkar, J.H.L.-M., T.M., J.L.B., M.F., D.S., A.H.P., X. Zhu, X.Y., J.A.C.S., J.C.C., R.E.P. and Q.Y. analyzed the genomes. R.M., R.V., C.M.W., H.T., M.C.S., D.S., M.W., M.F., X. Zhu, X.Y., J.A.C.S. and J.C.C. wrote the manuscript.

Competing financial interests

The authors declare no competing financial interests.

Supplementary information

PDF files

  1. Supplementary Text and Figures (2,226 KB)

    Supplementary Figures 1–11, Supplementary Tables 1–4 and 6–17, and Supplementary Note.

Excel files

  1. Supplementary Table 5 (2,117 KB)

    Summary of gene model annotations.
