Pineapple (Ananas comosus (L.) Merr.) is the most economically valuable crop possessing crassulacean acid metabolism (CAM), a photosynthetic carbon assimilation pathway with high water-use efficiency, and the second most important tropical fruit. We sequenced the genomes of pineapple varieties F153 and MD2 and a wild pineapple relative, Ananas bracteatus accession CB5. The pineapple genome has one fewer ancient whole-genome duplication event than sequenced grass genomes and a conserved karyotype with seven chromosomes from before the ρ duplication event. The pineapple lineage has transitioned from C3 photosynthesis to CAM, with CAM-related genes exhibiting a diel expression pattern in photosynthetic tissues. CAM pathway genes were enriched with cis-regulatory elements associated with the regulation of circadian clock genes, providing the first cis-regulatory link between CAM and circadian clock regulation. Pineapple CAM photosynthesis evolved by the reconfiguration of pathways in C3 plants, through the regulatory neofunctionalization of preexisting genes and not through the acquisition of neofunctionalized genes via whole-genome or tandem gene duplication.
- Supplementary Figure 1: Correlation between family copy number and expression level of LTR elements. (34 KB)
The data indicate that high expression levels of LTR elements are correlated with a relatively low copy number of their family.
- Supplementary Figure 2: Expression of intact LTR retrotransposons in nine pineapple tissue samples. (113 KB)
This heat map shows the number of RNA-seq reads mapped to the top 40 most highly expressed LTR retrotransposon families. Family names are shown as row labels, and tissue names are given as column labels. From top to bottom, the rows are sorted by total counts of mapped reads in families.
- Supplementary Figure 3: Expression of subfamilies of LTR retrotransposons in nine pineapple tissue samples. (93 KB)
The heat map shows the number of RNA-seq reads mapped to the top ten most highly expressed LTR retrotransposon families. Each row represents a subfamily, and each column represents a tissue. The numbers following family names give subfamily IDs. From top to bottom, the rows are sorted by total counts of mapped reads in families. Within each family, the rows are further sorted by total counts of mapped reads in subfamilies.
- Supplementary Figure 4: Synonymous substitutions per site (Ks) values between inferred whole-genome duplicates in pineapple. (93 KB)
(a) Syntenic dot plot in pineapple versus pineapple comparison, with Ks values color coded; only the gene pairs with a Ks value between 0 and 2 are plotted. (b) Histogram of Ks values for pineapple-rice orthologs, rice whole-genome duplicates and pineapple whole-genome duplicates.
- Supplementary Figure 5: Pairwise genome comparisons between pineapple and ten related plant species. (267 KB)
Pairwise comparisons (dot plots) between pineapple (y axis) and a total of ten related plant genomes (x axis), including (a–j) Amborella, banana, date palm, duckweed, grape, oil palm, orchid, pineapple (i.e., self-comparison), rice and sorghum. For clarity, only gene pairs within synteny blocks of at least size 4 are shown.
- Supplementary Figure 6: Microsynteny fractionation for 4:1 pineapple to Amborella, providing evidence that pineapple has undergone two WGDs in its lineage since their divergence. (227 KB)
Five exemplar regions are shown. Each panel contains multiple parallel tracks representing syntenic regions in Amborella and pineapple. Connecting lines show sequence similarities between the regions. CoGe links: https://genomevolution.org/r/e426, https://genomevolution.org/r/e428, https://genomevolution.org/r/e427, https://genomevolution.org/r/e448 and https://genomevolution.org/r/e446.
- Supplementary Figure 7: Microsynteny fractionation for 1:2 pineapple to rice, providing evidence that rice has undergone one WGD in its lineage (ρ) since its divergence from pineapple. (156 KB)
Three exemplar regions are shown. Each panel contains multiple parallel tracks representing syntenic regions in rice and pineapple. Connecting lines show sequence similarities between the regions. CoGe links: https://genomevolution.org/r/e3kg, https://genomevolution.org/r/e3kw and https://genomevolution.org/r/e3k4.
- Supplementary Figure 8: Dating of whole-genome duplication (WGD) events on the flowering plant tree. (42 KB)
Letters represent previously identified WGDs. Estimated gene family phylogenies including genes on syntenic blocks corresponding to the σ and τ WGDs were queried to identify the timing of implied gene duplications relative to speciation events. The numbers below each lineage in the monocot clade represent gene duplication events corresponding to the σ (green) and τ (purple) synteny blocks. Trees with inferred duplication events supported by greater than 80% (left) and between 80% and 50% (right) bootstrap support values are shown for each node. Taxon names are color coded as in Figure 2.
- Supplementary Figure 9: Property of leaf green tip gene interaction network. (60 KB)
(a,c,d) Distributions of the node degree, diameter and betweenness attribute. (b) Relationship between node degree and frequency in logarithmic coordinates.
- Supplementary Text and Figures (2,226 KB)
Supplementary Figures 1–11, Supplementary Tables 1–4 and 6–17, and Supplementary Note.
- Supplementary Table 5 (2,117 KB)
Summary of gene model annotations.
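
The row ordering described for Supplementary Figure 3 (families ranked by total mapped reads, then subfamilies ranked within each family) can be illustrated with a minimal sketch. The function name, the input layout, the toy counts and the choice of descending order are assumptions made for illustration; the source does not state the sort direction or how the figure was actually produced.

```python
from collections import defaultdict


def sort_heatmap_rows(counts):
    """Order heat-map rows by family total, then by subfamily total.

    counts maps (family, subfamily_id) -> list of per-tissue read counts.
    Descending order is an assumption; the caption only says "sorted".
    """
    # Total mapped reads per subfamily (sum over tissues).
    sub_totals = {key: sum(values) for key, values in counts.items()}

    # Total mapped reads per family (sum over its subfamilies).
    family_totals = defaultdict(int)
    for (family, _subfamily), total in sub_totals.items():
        family_totals[family] += total

    # Primary key: family total; secondary key: subfamily total.
    # Family name and subfamily ID break ties deterministically.
    return sorted(
        counts,
        key=lambda key: (-family_totals[key[0]], key[0], -sub_totals[key], key[1]),
    )


# Hypothetical toy data: two families, three tissues.
example = {
    ("famA", 1): [5, 0, 2],
    ("famA", 2): [10, 3, 1],
    ("famB", 1): [40, 2, 0],
}
rows = sort_heatmap_rows(example)
```

Keeping the family total as the leading sort key ensures that subfamilies of the same family stay adjacent, which matches the grouped layout the caption describes.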