The Kalanchoë genome provides insights into convergent evolution and building blocks of crassulacean acid metabolism

Yang, Xiaohan; Hu, Rongbin; Yin, Hengfu; Jenkins, Jerry; Shu, Shengqiang; Tang, Haibao; Liu, Degao; Weighill, Deborah A.; Cheol Yim, Won; Ha, Jungmin; Heyduk, Karolina; Goodstein, David M.; Guo, Hao-Bo; Moseley, Robert C.; Fitzek, Elisabeth; Jawdy, Sara; Zhang, Zhihao; Xie, Meng; Hartwell, James; Grimwood, Jane; Abraham, Paul E.; Mewalal, Ritesh; Beltrán, Juan D.; Boxall, Susanna F.; Dever, Louisa V.; Palla, Kaitlin J.; Albion, Rebecca; Garcia, Travis; Mayer, Jesse A.; Don Lim, Sung; Man Wai, Ching; Peluso, Paul; Van Buren, Robert; De Paoli, Henrique Cestari; Borland, Anne M.; Guo, Hong; Chen, Jin-Gui; Muchero, Wellington; Yin, Yanbin; Jacobson, Daniel A.; Tschaplinski, Timothy J.; Hettich, Robert L.; Ming, Ray; Winter, Klaus; Leebens-Mack, James H.; Smith, J. Andrew C.; Cushman, John C.; Schmutz, Jeremy; Tuskan, Gerald A.

doi:10.1038/s41467-017-01491-7

Article
Open access
Published: 01 December 2017

Xiaohan Yang ORCID: orcid.org/0000-0001-5207-4210^1,2,
Rongbin Hu ORCID: orcid.org/0000-0001-5921-6891¹,
Hengfu Yin ORCID: orcid.org/0000-0002-0720-5311¹,
Jerry Jenkins³,
Shengqiang Shu⁴,
Haibao Tang ORCID: orcid.org/0000-0002-3460-8570⁵,
Degao Liu¹,
Deborah A. Weighill ORCID: orcid.org/0000-0003-4979-5871^1,2,
Won Cheol Yim ORCID: orcid.org/0000-0002-7489-0435⁶,
Jungmin Ha⁶,
Karolina Heyduk⁷,
David M. Goodstein⁴,
Hao-Bo Guo⁸,
Robert C. Moseley^1,2,
Elisabeth Fitzek⁹,
Sara Jawdy¹,
Zhihao Zhang¹,
Meng Xie ORCID: orcid.org/0000-0003-0247-3701¹,
James Hartwell ORCID: orcid.org/0000-0001-5000-223X¹⁰,
Jane Grimwood³,
Paul E. Abraham ORCID: orcid.org/0000-0002-0153-2434¹¹,
Ritesh Mewalal ORCID: orcid.org/0000-0002-0153-2434¹,
Juan D. Beltrán¹²,
Susanna F. Boxall¹⁰,
Louisa V. Dever¹⁰,
Kaitlin J. Palla^1,2,
Rebecca Albion⁶,
Travis Garcia⁶,
Jesse A. Mayer⁶,
Sung Don Lim⁶,
Ching Man Wai¹³,
Paul Peluso¹⁴,
Robert Van Buren¹⁵,
Henrique Cestari De Paoli ORCID: orcid.org/0000-0001-8494-0603^1,16,
Anne M. Borland^1,17,
Hong Guo⁸,
Jin-Gui Chen ORCID: orcid.org/0000-0002-1752-4201¹,
Wellington Muchero¹,
Yanbin Yin⁹,
Daniel A. Jacobson ORCID: orcid.org/0000-0002-9822-8251^1,2,
Timothy J. Tschaplinski ORCID: orcid.org/0000-0002-9540-6622¹,
Robert L. Hettich¹¹,
Ray Ming^5,13,
Klaus Winter¹⁸,
James H. Leebens-Mack⁷,
J. Andrew C. Smith¹²,
John C. Cushman⁶,
Jeremy Schmutz^3,4 &
…
Gerald A. Tuskan¹

Nature Communications volume 8, Article number: 1899 (2017)

Abstract

Crassulacean acid metabolism (CAM) is a water-use efficient adaptation of photosynthesis that has evolved independently many times in diverse lineages of flowering plants. We hypothesize that convergent evolution of protein sequence and temporal gene expression underpins the independent emergences of CAM from C₃ photosynthesis. To test this hypothesis, we generate a de novo genome assembly and genome-wide transcript expression data for Kalanchoë fedtschenkoi, an obligate CAM species within the core eudicots with a relatively small genome (~260 Mb). Our comparative analyses identify signatures of convergence in protein sequence and re-scheduling of diel transcript expression of genes involved in nocturnal CO₂ fixation, stomatal movement, heat tolerance, circadian clock, and carbohydrate metabolism in K. fedtschenkoi and other CAM species in comparison with non-CAM species. These findings provide new insights into molecular convergence and building blocks of CAM and will facilitate CAM-into-C₃ photosynthesis engineering to enhance water-use efficiency in crops.

Introduction

Crassulacean acid metabolism (CAM) is a metabolic adaptation of photosynthetic CO₂ fixation that enhances plant water-use efficiency (WUE) and associated drought avoidance/tolerance by reducing transpirational water loss through stomatal closure during the day, when temperatures are high, and stomatal opening during the night, when temperatures are lower¹. In the face of the rapidly increasing human population and global warming predicted over the next century, the outstanding WUE of CAM plants highlights the potential of the CAM pathway for sustainable food and biomass production on semi-arid, abandoned, or marginal agricultural lands^2,3,4.

CAM photosynthesis can be divided into two major phases: (1) nocturnal uptake of atmospheric CO₂ through open stomata and primary fixation of CO₂ by phosphoenolpyruvate carboxylase (PEPC) to oxaloacetate (OAA) and its subsequent conversion to malic acid by malate dehydrogenase; and (2) daytime decarboxylation of malate and CO₂ refixation via C₃ photosynthesis, mediated by ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO)^5,6. Malic acid is stored in the vacuole of photosynthetically active cells reaching a peak at dawn and can be used as a reference point to divide the two phases. CAM is found in over 400 genera across 36 families of vascular plants⁴ and is thought to have evolved multiple times independently from diverse ancestral C₃ photosynthesis lineages⁷. The core biochemical characteristics of the CAM cycle are similar in all the plant lineages in which CAM has evolved, with some variation in the enzymes that catalyze malate decarboxylation during the day, and in the storage carbohydrates that provide substrates for malic acid synthesis at night^8,9.

We hypothesize that convergent evolution in protein sequence and/or temporal diel gene expression underpins the multiple and independent emergences of CAM from C₃ photosynthesis. Convergent evolution is generally defined as the appearance of similar phenotypes in distinct evolutionary lineages¹⁰. Although phenotypic convergence is widely recognized, its evolutionary mechanism has been extensively debated. Morris¹¹ argues that the evolutionary course is not random but selection-constrained, along certain pathways, to arrive at the same solution or outcome. Recently, comparative genomics analysis began to provide new insight into the molecular mechanism of convergent evolution. For example, Foote et al.¹² performed comparative genomic analyses of three species of marine mammals (the killer whale, walrus, and manatee) that share independently evolved phenotypic adaptations to a marine existence, and identified convergent amino-acid substitutions in genes evolving under positive selection and putatively associated with a marine phenotype. Also, Hu et al.¹³ compared the genomes of the bamboo-eating giant and red pandas, two obligate bamboo-feeders that independently possess adaptive pseudothumbs, and identified 70 adaptively convergent genes (i.e., under positive selection in these two species), of which nine genes, featuring nonrandom convergent amino-acid substitution between giant and red pandas, are closely related to limb development and essential nutrient utilization. These two examples indicate that specific amino-acid replacements at a small number of key sites can result in highly predictable convergent outcomes, supporting the constrained selection theory of Morris¹¹. However, such predictable protein sequence convergence was not found in the convergence of hemoglobin function in high-altitude-dwelling birds, indicating that possible adaptive solutions are perhaps contingent upon prior evolutionary history¹⁴. 
This finding supports the contingent adaptation theory¹⁵ that evolution is contingent upon history and consequently replaying life’s tape will give different outcomes. In addition to protein sequence convergence, convergent changes in gene expression were found to be associated with convergent evolution of vocal learning in the brains of humans and song-learning birds¹⁶. Therefore, convergent changes in both protein sequence and gene expression are important aspects of the molecular basis of convergent evolution.

We sought to investigate whether changes in protein sequence and/or gene expression contribute to the evolutionary convergence of CAM through genome-wide screening for signatures of convergent changes in protein sequences and diel mRNA expression patterns that meet the following criteria: the signatures are (1) isomorphic in the CAM genomes of distant groups, such as eudicots and monocots, which diverged ~135 million years ago¹⁷, and (2) dimorphic in related C₃ photosynthesis genomes. Recently, the genome sequences of two monocot CAM species, Ananas comosus (L.) Merr. (pineapple)¹⁸, and Phalaenopsis equestris (Schauer) Rchb.f. (moth orchid)¹⁹, were published. Here we present the genome sequence of Kalanchoë fedtschenkoi Raym.-Hamet & H. Perrier, which is an emerging molecular genetic model for obligate CAM species in the eudicots^4,6,20. Our analyses reveal the genomic signatures of convergence shared between eudicot (represented by Kalanchoë) and monocot (represented by pineapple and orchid) CAM species.

Results

Kalanchoë genome assembly and annotation

The diploid K. fedtschenkoi (2n = 2x = 34 chromosomes; Supplementary Fig. 1) genome size was estimated to be ~260 Mb (Supplementary Table 1). The K. fedtschenkoi genome was assembled from ~70× paired-end reads and ~37× mate-pair reads generated using an Illumina MiSeq platform (Supplementary Table 2 and Supplementary Fig. 2). The genome assembly consisted of 1324 scaffolds with a total length of 256 Mb and scaffold N50 of 2.45 Mb (Supplementary Table 3), in which we predicted and annotated 30,964 protein-coding genes (Supplementary Table 4).
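The scaffold N50 quoted above is the length of the scaffold at which the cumulative sum of scaffold lengths, sorted from longest to shortest, first reaches half of the total assembly length. The following is a minimal sketch of that calculation; the function name and the toy lengths are illustrative assumptions and are not taken from the K. fedtschenkoi assembly.

```python
def scaffold_n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to half of the
        # assembly length defines the N50.
        if running >= half_total:
            return length


# Toy scaffold lengths (hypothetical values for illustration only).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(scaffold_n50(toy_lengths))
```

With the toy input, the total is 300, so the N50 is the length of the scaffold at which the running sum first reaches 150.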

The phylogenetic placement of Kalanchoë

Kalanchoë is the first eudicot CAM lineage with a genome sequence to date and serves as an important reference for understanding the evolution of CAM. In addition, K. fedtschenkoi is the first sequenced species in the distinct eudicot lineage, Saxifragales. Although the monophyly of this morphologically diverse order is well supported by molecular data, its phylogenetic placement has been less clear^21,22. The recent consensus view, based mainly on analyses of plastid DNA sequences, has placed the Saxifragales as a sister group to the rosids, and together they comprise the large clade of superrosids^23,24. However, there have been indications of conflict between trees based on plastid genomes and nuclear genomes for this clade^19,24. Additionally, the major lineages of core eudicots are thought to have diversified rapidly following their first appearance, making resolution of the relationships among these clades particularly challenging^17,25 and implicating incomplete lineage sorting (ILS) as a potentially important process that would result in discordance among gene histories²⁶.

We performed phylogenetic analyses with 210 single-copy nuclear genes from 26 sequenced plant genomes using multiple phylogenetic inference strategies. The resulting species trees are congruent with each other except for the placement of K. fedtschenkoi, which was placed either as sister to the rosids in a phylogenetic tree reconstructed using a quartet-based coalescent species tree method (Fig. 1) or as sister to all other core eudicots as revealed by alternative phylogenetic trees reconstructed from (1) concatenated protein sequence alignment without gene partition using maximum-likelihood (Supplementary Fig. 3), (2) a partitioned analysis of multi-gene alignment using maximum-likelihood and Bayesian methods (Supplementary Fig. 4), and (3) analysis of individual gene trees using fully Bayesian multispecies coalescent method (Supplementary Fig. 5). Despite substantial discordance among estimated nuclear gene trees, the coalescence-based tree was consistent with the results of the plastome-based analyses, placing Kalanchoë as sister to the rosids (Fig. 1). Coalescent species tree estimation can account for gene tree discordance due to ILS²⁷. At the same time, alternative placements of Kalanchoë as sister to the asterids, or as sister to all other core eudicots were observed in many gene trees (Fig. 1 and Supplementary Fig. 5). Gene tree discordance due to rapid diversification early in eudicot history has also been characterized by others²⁴. Regardless of the optimal placement of the Saxifragales, including Kalanchoë, individual gene trees will often have alternative histories due to ILS in the face of rapid species diversification.

Kalanchoë genome duplication

The grape genome has no additional genome duplication after the ancestral gamma hexaploidization^28,29 and is the best available reference for studying ancestral eudicot genome duplication events. Syntenic depth analyses^30,31 showed that there are multiple K. fedtschenkoi blocks covering each grape gene (Fig. 2a and Supplementary Fig. 6). Specifically, 65% of the grape genome had from one to four syntenic blocks in K. fedtschenkoi. In contrast, a sudden drop in syntenic depth occurred after a depth of 4× (Fig. 2a), indicating that each grape genome region has up to four K. fedtschenkoi blocks and thus providing strong evidence for two distinct whole-genome duplication (WGD) events in K. fedtschenkoi. The microsynteny patterns further support two WGDs on the lineage leading to K. fedtschenkoi. Specifically, the microsynteny pattern reflects a 1:4 gene copy ratio between the grape genome and the diploid K. fedtschenkoi genome (Fig. 2b).
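Syntenic depth can be thought of as the number of query-genome blocks that cover each reference gene. The sketch below counts that depth over an ordered list of reference genes and tabulates how many genes fall at each depth. It is an illustration only: the block representation (inclusive index ranges over the reference gene order), the function names, and the toy data are assumptions, not the pipeline used in the study.

```python
from collections import Counter


def syntenic_depth(n_ref_genes, blocks):
    """Depth of coverage per reference gene.

    blocks: iterable of (first, last) inclusive 0-based indices into the
    reference gene order, one pair per syntenic block.
    """
    diff = [0] * (n_ref_genes + 1)
    for first, last in blocks:
        if not (0 <= first <= last < n_ref_genes):
            raise ValueError(f"block out of range: {(first, last)}")
        diff[first] += 1      # block starts covering here
        diff[last + 1] -= 1   # block stops covering after `last`
    depth, running = [], 0
    for delta in diff[:-1]:
        running += delta
        depth.append(running)
    return depth


def depth_histogram(depth):
    """Number of reference genes observed at each depth."""
    return dict(sorted(Counter(depth).items()))


# Toy example: 10 reference genes, 4 hypothetical query blocks.
toy_blocks = [(0, 5), (2, 7), (2, 4), (3, 4)]
toy_depth = syntenic_depth(10, toy_blocks)
print(toy_depth)
print(depth_histogram(toy_depth))
```

A sharp fall in the histogram beyond a given depth, as described for 4× in the text, is the kind of pattern this tabulation would make visible.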

From the Kalanchoë point of view, we found that 49% of the Kalanchoë genome was covered by one grape-Kalanchoë block, 7% covered in two grape-Kalanchoë blocks, and 1% covered in three grape-Kalanchoë blocks (Supplementary Fig. 7). This suggests that we could often find one best grape-Kalanchoë block out of the three gamma triplicated regions in grape. This fits the scenario that the gamma WGD predated the divergence and there has been no WGD in the grape lineage since grape-Kalanchoë diverged. Alternatively, if the divergence predated the gamma WGD, then from the Kalanchoë point of view we should instead see three matching grape regions. Hence, the grape-Kalanchoë genome comparisons strongly supported the gamma WGD as a shared event, and further supported the phylogenetic position of Kalanchoë in Fig. 1.

Despite two apparent WGDs in the K. fedtschenkoi lineage, synonymous substitutions per synonymous site (Ks) between duplicate gene pairs showed only one prominent peak at ~0.35 (Supplementary Fig. 8). The unimodal distribution of Ks suggests that the two WGD events occurred close together in time. Similarly, only one peak appears in the distribution of the four-fold transversion substitution rate (4dtv) values between the K. fedtschenkoi gene pairs (Fig. 2c). Grape-Kalanchoë gene pairs show a prominent peak around Ks = 1.5 (Supplementary Fig. 8), indicating that the WGDs in the K. fedtschenkoi lineage occurred well after its divergence from grape early in the history of the rosid lineage.

Gene co-expression modules and clusters in Kalanchoë

To elucidate gene function in K. fedtschenkoi, we performed a weighted correlation network analysis of transcript expression in 16 samples including 12 mature leaf samples collected every 2 h over a 24-h period and four non-leaf samples collected 4 h after the beginning of the light period, including shoot tip (leaf pair 1 plus the apical meristem), stem (between leaf pair 3 and leaf pair 8), root, and flower. Our analysis identified 25 co-expression modules, among which one module (MEblack containing 782 genes) was significantly (Student’s t-test, P < 0.001) associated with the leaf samples collected during the dark period (Supplementary Fig. 9), with an increase in transcript abundance at night (Supplementary Fig. 10). Several biological processes (e.g., carboxylic acid biosynthesis, terpene biosynthesis, and lipid metabolism) were over-represented (hypergeometric enrichment test, P < 0.05) (Supplementary Data 1), and several key genes encoding proteins involved in nocturnal CAM carboxylation and vacuolar uptake of malate such as Kaladp0018s0289 (β-CA), Kaladp0048s0578 (PEPC2), Kaladp0037s0517 (PPCK), Kaladp0022s0111 (MDH), and Kaladp0062s0038 (ALMT6) were present in this module (Fig. 3a, Supplementary Note 1 and Supplementary Table 5). These results suggest that genes in the co-expression module MEblack play important roles in the nighttime processes that define CAM. One alternate module (MEblue containing 1911 genes) was significantly correlated with the leaf samples collected during the day (Supplementary Fig. 9), with an increase in transcript abundance during the light period (Supplementary Fig. 10). Several biological processes (e.g., starch biosynthesis, coenzyme biosynthetic process) were over-represented (hypergeometric enrichment test, P < 0.05) in this module (Supplementary Data 1). One gene in the CAM decarboxylation process, Kaladp0010s0106 (PPDK-RP), belongs to this module (Supplementary Table 6).
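The hypergeometric enrichment test mentioned above asks how likely it is to see at least the observed number of genes with a given annotation in a module, if the module were a random draw from all annotated genes. The following is a minimal sketch using only the Python standard library; the function name and the example counts are hypothetical and are not values reported in the study.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Upper-tail probability P(X >= hits).

    population: total number of genes considered
    annotated:  genes in the population carrying the term
    sample:     genes in the module (drawn without replacement)
    hits:       module genes carrying the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # math.comb returns 0 when k > n, so impossible combinations drop out.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts for illustration: a 782-gene module drawn from
# 30,964 genes, with 200 genes carrying the term, 15 of them in the module.
p_value = hypergeom_enrichment_p(30964, 200, 782, 15)
print(f"P(X >= 15) = {p_value:.3g}")
```

In practice a multiple-testing correction would normally follow when many terms are tested; that step is omitted here.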

We also performed cluster analysis on the CAM leaf time-course expression data for the transcripts that showed significantly (ANOVA of glm models where H₀ = a flat line, P < 0.05) time-structured diel expression patterns as determined by a polynomial regression. Clustering of transcripts with time-structured expression identified 11 clusters (Supplementary Fig. 11 and Supplementary Table 7). Networks constructed for each cluster implicated highly connected hub genes and their direct or indirect interactions with CAM-related genes (Supplementary Data 2). For example, cluster 7, which contains PEPC1 (Kaladp0095s0055) and PPCK2 (Kaladp0604s0001), has a zinc-finger protein CONSTANS-like gene as a central hub (Supplementary Data 2). CONSTANS-like genes are part of the circadian clock regulatory network³². Similarly, multiple REVEILLE transcripts, which encode transcription factors for genes with evening elements in their promoters³³, are hubs in cluster 4 that contains NADP-ME genes (Kaladp0092s0166) (Supplementary Data 2).

Overview of genes that have undergone convergent evolution

To determine the possibility that the diel reprogramming of metabolism that distinguishes CAM from C₃ photosynthesis was achieved, at least in part, by convergent shifts in diel patterns of gene expression, we performed comparative analysis of diel transcript abundance patterns in CAM and C₃ photosynthesis species. Specifically, we compared the diel expression patterns of 9733 ortholog groups of genes from K. fedtschenkoi (eudicot, CAM photosynthesis), A. comosus (monocot, CAM photosynthesis), and Arabidopsis thaliana (eudicot, C₃ photosynthesis), with transcript abundances >0.01 FPKM in mature leaf samples collected at six or more diel time points. Sampling time points included dawn (22, 24, and 2 h from the start of the light period), midday (4, 6, and 8 h from the start of the light period), dusk (10, 12, and 14 h from the start of the light period), and midnight (16, 18, and 20 h from the start of the light period) (Fig. 4a). A gene from K. fedtschenkoi was defined as having undergone convergent evolution of gene expression if it met all of the following criteria: (1) its diel transcript expression pattern was highly correlated (Spearman’s rank correlation coefficient, r > 0.8) with those of at least one of the orthologs in A. comosus, but not highly correlated (r < 0.5) with those of any of the orthologs in A. thaliana; (2) it displayed a significant difference (false discovery rate <0.01) in transcript abundance either between midday and midnight (e.g., Fig. 4b), or between dawn and dusk (e.g., Fig. 4c); and (3) the time shift between K. fedtschenkoi and A. comosus transcript time-courses was less than or equal to 3 h, whereas the time shifts between CAM species (K. fedtschenkoi and A. comosus) transcripts and their A. thaliana ortholog transcript were equal to or greater than 6 h. Based on these criteria, 54 K. fedtschenkoi genes were identified as candidates for involvement in the convergent shift in diel gene expression patterns specific to the two CAM species relative to A. thaliana (Supplementary Note 2, Supplementary Data 3 and Supplementary Table 8).
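Criteria (1) and (3) above can be illustrated with a small sketch that combines Spearman's rank correlation with a circular time-shift estimate. Several points here are assumptions rather than the published method: the text does not say how time shifts were computed, so this sketch takes the rotation of one profile that maximizes its rank correlation with the other; it resolves shifts only to the 2-h sampling interval; it compares one ortholog per species; and it omits criterion (2), the differential-abundance test. All function names and the toy profiles are hypothetical.

```python
def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the ranks.

    Assumes neither profile is constant (otherwise the variance is zero).
    """
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def time_shift_h(x, y, step_h=2, period_h=24):
    """Circular shift in hours (folded to 0..period/2) that best aligns y to x."""
    y = list(y)
    best_s, best_r = 0, float("-inf")
    for s in range(len(y)):
        r = spearman(x, y[s:] + y[:s])  # rotate y left by s samples
        if r > best_r:
            best_s, best_r = s, r
    shift = (best_s * step_h) % period_h
    return min(shift, period_h - shift)


def is_convergent_candidate(kf, ac, at):
    """Apply criteria (1) and (3) to one ortholog trio of diel profiles."""
    correlated_in_cam = spearman(kf, ac) > 0.8 and spearman(kf, at) < 0.5
    small_cam_shift = time_shift_h(kf, ac) <= 3
    large_c3_shift = time_shift_h(kf, at) >= 6 and time_shift_h(ac, at) >= 6
    return correlated_in_cam and small_cam_shift and large_c3_shift


# Toy 12-point profiles sampled every 2 h (hypothetical abundances).
kf = [0, 1, 3, 6, 10, 15, 12, 9, 7, 5, 4, 2]
ac = kf[-1:] + kf[:-1]   # same shape, delayed by one sample (2 h)
at = kf[-6:] + kf[:-6]   # same shape, delayed by six samples (12 h)
print(is_convergent_candidate(kf, ac, at))
```

Because the shift is estimated on a 2-h grid, sub-sampling-interval shifts such as those reported in the text would need interpolation or a model fit, which this sketch does not attempt.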

To identify genes that had likely undergone convergent evolution in protein sequence in the CAM species, we reconstructed gene tribes based on protein sequences from the species listed in Supplementary Fig. 4. We then created phylogenetic trees for the genes from all tribes that include at least one gene from each of the 13 studied species (Supplementary Table 9). A K. fedtschenkoi gene was defined as having undergone convergent evolution in protein sequence if it met all of the following criteria: (1) the K. fedtschenkoi gene is clustered with gene(s) from at least one of the two monocot CAM species (A. comosus and P. equestris) in a phylogenetic clade containing no genes from C₃ or C₄ photosynthesis species; (2) convergent amino-acid changes were detected between the K. fedtschenkoi gene with gene(s) from at least one of the two monocot CAM species; and (3) the K. fedtschenkoi gene shared at least one amino-acid mutation with its ortholog in at least one of the two monocot CAM species, as compared with C₃ and C₄ photosynthesis species. A total of four K. fedtschenkoi genes showing convergent changes in protein sequences were identified (Supplementary Figs. 12–15 and Supplementary Table 10).
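Criterion (3) above, a residue shared by the CAM orthologs but absent from the C₃ and C₄ sequences at the same alignment position, can be screened for with a simple column scan. The sketch below is only that simple scan: it is not a phylogeny-aware convergence test, and the function name, sequence labels, and toy alignment are hypothetical.

```python
def cam_specific_sites(alignment, cam_ids):
    """Find alignment columns where all CAM sequences share one residue
    that no other sequence carries at that column.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths)
    cam_ids:   names of the CAM sequences within `alignment`
    Returns a list of (1-based column, shared CAM residue) pairs.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to equal length")
    cam = [alignment[name] for name in cam_ids]
    others = [seq for name, seq in alignment.items() if name not in cam_ids]
    sites = []
    for col in range(lengths.pop()):
        cam_residues = {seq[col] for seq in cam}
        if len(cam_residues) != 1:
            continue  # CAM sequences disagree at this column
        residue = next(iter(cam_residues))
        if residue == "-":
            continue  # ignore shared gaps
        if all(seq[col] != residue for seq in others):
            sites.append((col + 1, residue))
    return sites


# Toy alignment (hypothetical sequences): the CAM pair shares an acidic D
# where the other sequences carry a basic R, K, or H.
toy_alignment = {
    "cam_eudicot": "MADLR",
    "cam_monocot": "MSDLK",
    "c3_a": "MARLR",
    "c3_b": "MSKLK",
    "c4_a": "MAHLR",
}
print(cam_specific_sites(toy_alignment, ["cam_eudicot", "cam_monocot"]))
```

A hit from this scan is only a starting point; distinguishing convergence from shared ancestry requires the tree-based criteria described in the text.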

We also performed genome-wide positive selection analysis in each of the three CAM species (i.e., A. comosus, P. equestris, and K. fedtschenkoi) in comparison with 21 non-CAM species (Supplementary Method 1) and identified two genes that were under positive selection in the dicot CAM species K. fedtschenkoi and one of the monocot CAM species (Supplementary Figs. 16–17).

Convergent evolution of genes involved in CO₂ fixation

PEPC is a key enzyme for nocturnal CO₂ fixation and PPCK is a pivotal protein kinase that regulates PEPC in response to the circadian clock in CAM plants^4,6,34. PPCK phosphorylates PEPC in the dark (Fig. 5a) and thereby reduces malate inhibition of PEPC activity, promoting nocturnal CO₂ uptake^35,36. Multiple PPCK genes were identified in the K. fedtschenkoi genome, among which two genes (Kaladp0037s0517 and Kaladp0604s0001) showed higher transcript abundance than the others in CAM leaves (Supplementary Table 5). The diel expression patterns of the most abundant PPCK transcripts in K. fedtschenkoi (Kaladp0037s0517.1) and A. comosus (Aco013938.1) were highly correlated, with only a 1.5-hour time shift between them, whereas both showed an ~11-hour time shift relative to their best matched ortholog in Arabidopsis (AT1G08650) (Fig. 4b and Supplementary Table 8). Peak PPCK transcript abundance was shifted from daytime in C₃ photosynthesis species (Arabidopsis) to nighttime in the two CAM species (Fig. 4b), which suggests convergence and is consistent with PPCK activation of PEPC-mediated nocturnal CO₂ fixation. Among the PEPC genes identified in K. fedtschenkoi, Kaladp0095s0055 and Kaladp0048s0578 showed higher transcript abundance than the others (Supplementary Table 5). Kaladp0095s0055 (named PEPC1 herein) was an abundant transcript throughout both the light and the dark period, with its peak transcript level phased to dusk. The second most abundant PEPC transcript (Kaladp0048s0578, named PEPC2 herein) showed a much higher transcript level during the dark period than during the light period (Fig. 5b). We found that a duplicated pair of K. fedtschenkoi PEPC2 genes (Kaladp0048s0578 and Kaladp0011s0355) clustered together with a PEPC gene (PEQU_07008) from P. equestris (Supplementary Fig. 12). PEQU_07008 was recently reported as the CAM-type PEPC in P. equestris, and, like Kaladp0048s0578, this orchid PEPC gene also showed higher transcript abundance during the dark period than during the light period³⁷.

Convergent changes in PEPC2 protein sequence were found between K. fedtschenkoi and P. equestris (Fig. 6a, b). Specifically, multiple protein sequence alignment revealed that an aspartic acid residue (D509) in Kaladp0048s0578 is conserved in PEQU_07008 and Kaladp0011s0355 (a duplicated copy of Kaladp0048s0578), but there was an arginine (R), lysine (K), or histidine (H) in the corresponding sites of the PEPC protein sequences of other tested species (Fig. 6c and Supplementary Fig. 12). The structural model of the Kaladp0048s0578 protein indicates that this single amino-acid substitution (from a basic amino-acid R/K/H to an acidic amino-acid D) is located in an α-helix adjacent to the active site in a β-barrel (Fig. 7a). We hypothesize that an activator binds to the active site of one subunit of the tetrameric complex of PEPC2, leading to allosteric conformational changes that subsequently activate another subunit of the tetramer (Fig. 7b). This model was supported by a recent crystallography structure of the Flaveria trinervia (a C₄ photosynthesis plant) PEPC with an activator glucose-6-phosphate (G6P) bound at the β-barrel active center³⁸. Based on this model, because D509 of PEPC2 (Kaladp0048s0578) is negatively charged, as is G6P, the observed substitution may play a similar role to the activator by triggering allosteric conformational changes that lead to activation of the other subunits of the PEPC tetramer. Nimmo³⁹ reported that PEPC is subject to posttranslational regulation in the dark via phosphorylation by PPCK. In vitro analysis of the activities of different heterologously expressed PEPC isoforms showed that without phosphorylation by PPCK, PEPC1 from K. fedtschenkoi had a much lower activity than PEPC2 from either K. fedtschenkoi or P. equestris (Fig. 6d). Further, the R515D mutation significantly (Student’s t-test, P < 0.01) increased the activity of K. fedtschenkoi PEPC1, whereas the D509K and D504K mutations significantly (Student’s t-test, P < 0.01) reduced the activities of K. fedtschenkoi PEPC2 and P. equestris PEPC2, respectively (Fig. 6d). These results indicate that a single amino-acid mutation could significantly modify PEPC activity.

Our evolutionary analyses did not detect convergent evolution in either protein sequence or diel transcription patterns for the various decarboxylation genes that are expressed in Kalanchoë and A. comosus. In Kalanchoë, NAD(P)-ME genes were highly expressed, whereas the expression of the PEPCK gene was very low (Supplementary Fig. 18), consistent with the known high extractable activities of NAD-ME and NADP-ME in CAM leaves of Kalanchoë ^40,41. By contrast, in A. comosus the transcript abundance of PEPCK was much higher than that of malic enzyme (ME) (Supplementary Fig. 18), supporting the model that malate decarboxylation in Kalanchoë is mediated by ME, which was recently substantiated using a transgenic RNAi approach^20,40, whereas in pineapple a combination of MDH, working in the OAA-forming direction, coupled with PEPCK, converting OAA to PEP and CO₂, are the candidate decarboxylation enzymes¹⁸, consistent with previous enzyme activity studies⁸.

Convergent evolution of genes involved in stomatal movement

A unique feature of CAM physiology is the inverted light/dark pattern of stomatal movement relative to C₃ photosynthesis, with stomata opening during the night in CAM and during the day in C₃ photosynthesis plants⁶. Blue light is a key environmental signal that controls stomatal opening, and phototropin 2 (PHOT2; AT5G58140), a blue light photoreceptor, mediates blue light regulation of stomatal opening in Arabidopsis⁴². Twenty genes that could potentially be involved in stomatal movement in K. fedtschenkoi were predicted based on homology to Arabidopsis genes involved in the regulation of stomatal movement (Supplementary Table 11). One of these genes, Kaladp0033s0113, which encodes PHOT2, showed only a 1-h time shift in transcript abundance pattern relative to its A. comosus ortholog (Aco014242) (Supplementary Table 8), possibly indicating a convergent change in the diel pattern of its transcript abundance in the two CAM species. In support of a convergent evolution hypothesis, the transcript abundance patterns of the two PHOT2 genes in the CAM species showed 11- (Kalanchoë) and 9- (pineapple) hour phase shifts, respectively, relative to that of the PHOT2 gene (AT5G58140) in the C₃ photosynthesis species Arabidopsis (Fig. 4c). The timing of peak transcript abundance shifted from dawn in Arabidopsis to dusk in the two CAM species (Fig. 4c). This convergent change in diel transcript abundance pattern suggests that PHOT2 might contribute to the inverted day/night pattern of stomatal closure and opening in CAM species, such that PHOT2 might function as a switch mediating the blue-light signal to open the stomata at dusk, after which the stomata could remain open during the dark period.

Convergent evolution of genes involved in heat tolerance

The stomata of mature CAM leaves of K. fedtschenkoi close for the majority of the light period⁴⁰, which may exacerbate the internal heat load on the leaves⁴³. Photosynthesis is sensitive to heat stress and can be inhibited long before other symptoms of heat stress are detected⁴⁴. Numerous studies have shown that the inhibition of photosynthesis by moderate heat stress is a consequence of RuBisCO deactivation, caused, in part, by the thermal instability of RuBisCO activase⁴⁵. Heat-shock proteins can play a critical role in the stabilization of proteins under heat stress conditions⁴⁶. Wang et al.⁴⁷ reported that HSP40 (SlCDJ2) contributed to the maintenance of CO₂ assimilation capacity mainly by protecting RuBisCO activity under heat stress and that HSP70 (cpHsp70) acted as a binding partner for SlCDJ2 in tomato. HSP70 can also function as nano-compartments in which single RbcL/RbcS subunits can fold in isolation, unimpaired by aggregation⁴⁸, as illustrated in Fig. 8a. Among the HSP70 genes predicted in K. fedtschenkoi, Kaladp0060s0296 displayed peak transcript abundance in the morning, with only a 1-h shift in diel transcript abundance pattern relative to its A. comosus ortholog Aco031458, whereas these two HSP70 genes in the CAM species showed ~10-h shifts in diel transcript abundance pattern relative to their best-matched A. thaliana ortholog, AT5G02490 (Fig. 8b and Supplementary Table 8), suggesting that HSP70 has undergone convergent changes in diel transcript expression patterns during the evolution of CAM.

Convergent evolution of genes in the circadian clock

Key physiological and biochemical features of CAM including net CO₂ exchange and PEPC phosphorylation are well established as outputs of the circadian clock, displaying robust oscillation under free-running constant conditions^20,40. Thus, the circadian clock could be a key regulator of the diel reprogramming of metabolism and stomatal function that defines CAM. The molecular basis of circadian rhythms has been studied extensively in non-CAM species³³. Based on homology to Arabidopsis genes that have been shown to play important roles as molecular components of the circadian clock, 35 K. fedtschenkoi genes were predicted to be involved in circadian rhythms (Supplementary Table 12). None of these K. fedtschenkoi genes are among the list of genes showing convergent changes in diel expression pattern (Supplementary Data 3), suggesting that CAM evolution did not involve major changes in the diel expression pattern of these known circadian rhythm genes shared between Arabidopsis and K. fedtschenkoi. However, we cannot rule out the possibility of convergent evolution in unknown circadian rhythm genes between these two species. Also, it is possible that genes that are not involved in circadian rhythms in Arabidopsis could have taken on this function in K. fedtschenkoi. On the other hand, Kaladp0060s0460, which encodes ELONGATED HYPOCOTYL5 (HY5), showed a convergent change in protein sequences between K. fedtschenkoi and P. equestris (Supplementary Table 10). HY5 is a bZIP family transcription factor in the blue light signaling pathway that acts as an input to entrain the circadian clock³³ (Fig. 9a). A single amino-acid mutation (E-to-R) occurred in the C-terminal bZIP domains of the proteins encoded by Kaladp0060s0460 and its P. equestris ortholog PEQU_13446 as compared with HY5 from C₃ or C₄ photosynthesis species (Fig. 9b and Supplementary Fig. 14). 
The bZIP domain determines the DNA-binding ability of HY5 as a transcription factor⁴⁹ and mediates the interaction between HY5 and G-BOX BINDING FACTOR 1⁵⁰. HY5 has been shown to move from shoot to root to coordinate aboveground plant carbon uptake in the leaf and belowground nitrogen acquisition in the root⁵¹. Therefore, the potential roles of HY5 (Kaladp0060s0460) in circadian rhythmicity and shoot-to-root communication in K. fedtschenkoi need to be investigated using experimental approaches such as loss-of-function mutagenesis⁵².

Convergent evolution of genes in carbohydrate metabolism

Nocturnal production of phosphoenolpyruvate (PEP) as a substrate for dark CO₂ uptake represents a substantial sink for carbohydrates in CAM plants, which has to be balanced with the provision of carbohydrates for growth and maintenance⁵³. Carbohydrate active enzymes (CAZymes) play critical roles in regulating carbohydrate synthesis, metabolism, and transport in living organisms. There are six CAZyme classes: glycoside hydrolases (GHs), glycosyltransferases (GTs), polysaccharide lyases, carbohydrate esterases, auxiliary activities, and carbohydrate-binding modules. Each of these classes contains from a dozen to over one hundred different protein families based on sequence similarity⁵⁴. The six classes of CAZymes have different functions. For example, GH enzymes catalyze the hydrolysis of glycosidic bonds, while GT enzymes catalyze the formation of glycosidic bonds. Using CAZyme domain-specific hidden Markov models, defined in the dbCAN database⁵⁵, we identified 1093 genes in 100 CAZyme families in the K. fedtschenkoi genome, comparable to the total number (1149) of CAZyme genes in A. thaliana (Supplementary Data 4 and 5). Among these CAZyme genes, four ortholog groups (ORTHOMCL68, ORTHOMCL93, ORTHOMCL207, and ORTHOMCL9830) of genes (e.g., Kaladp0550s0020, Kaladp0011s0363, Kaladp0037s0421, Kaladp0055s0317, respectively) belonging to the CAZyme families GH100, GT20, GT2, and GT5, respectively, displayed convergent changes in their patterns of diel transcript abundance in two CAM species (K. fedtschenkoi and A. comosus) compared with the C₃ photosynthesis species (A. thaliana) (Supplementary Data 3). Specifically, the K. fedtschenkoi CAZyme genes with convergent changes in diel transcript abundance pattern (e.g., Kaladp0550s0020 [GH100], Kaladp0011s0363 [GT20], Kaladp0037s0421 [GT2], and Kaladp0055s0317 [GT5]) showed higher transcript abundance in the dark and early light period (Supplementary Fig. 19).
In particular, two genes (Kaladp0011s0363 and Kaladp0055s0317) were predicted to be involved in starch and sucrose metabolism (Supplementary Fig. 20). Kaladp0011s0363 encodes a probable trehalose phosphate synthase. Trehalose 6-phosphate is an important sugar signaling metabolite and is thought to link starch degradation to demand for sucrose and growth⁵⁶. Kaladp0550s0020 encodes an alkaline-neutral invertase that catalyzes the hydrolysis of sucrose to glucose and fructose. This invertase has also been implicated in metabolic signaling processes as an important regulator of plant growth and development⁵⁷. Taken together, these data suggest that the evolution of CAM from C₃ photosynthesis requires re-scheduling of the transcription of metabolic and signaling genes that regulate the partitioning of carbohydrates between reserves that provide substrates for CAM and carbohydrates required for growth.

In addition to the above convergent changes in expression pattern of four CAZyme genes, we also identified convergent changes in protein sequences of another two CAZyme genes (Kaladp0016s0058 [GT29] and Kaladp0067s0114 [GH35]) that were under positive selection (CodeML implemented in PosiGene⁵⁸, P < 0.05) in the dicot CAM species K. fedtschenkoi and one of the two monocot CAM species (A. comosus and P. equestris) (Supplementary Figs. 16–17). Kaladp0016s0058 encodes a putative sialyltransferase-like protein. Two single amino-acid mutations were found in Kaladp0016s0058 and its A. comosus ortholog Aco018360, as compared with the orthologous protein sequences of non-CAM species (Supplementary Fig. 16). These two mutations are close to each other (i.e., within a four-amino-acid distance), suggesting that they may affect the same functional domain. Kaladp0067s0114 encodes a beta-galactosidase protein that hydrolyzes the glycosidic bond between two or more carbohydrates. Two single amino-acid mutations were identified in Kaladp0067s0114 and its P. equestris ortholog PEQU_04899, as compared with the orthologous protein sequences of non-CAM species (Supplementary Fig. 17). These two mutations are close to each other (i.e., within an 11-amino-acid distance) in the middle of the galactose-binding domain (Supplementary Fig. 17), which can bind to specific ligands and carbohydrate substrates for enzymatic catalytic reactions⁵⁹. The relevance of these convergent changes in protein sequence to CAM evolution needs further investigation.

Discussion

The CAM pathway has been found in 36 families of vascular plants⁴, among which Crassulaceae plays a unique role in CAM research because the pathway was first discovered in this succulent plant family and was thus named⁶⁰. Within Crassulaceae, the genus Kalanchoë has been the most widely used for CAM research. As a model species for research into the molecular biology and functional genomics of CAM, K. fedtschenkoi stands out due to its relatively small genome, low repetitive sequence content, and efficient stable transformation protocols²⁰. The genome sequence presented in this study renders K. fedtschenkoi as a new model for plant evolutionary and comparative genomics research, both for CAM photosynthesis and beyond. Although this study focused on genome-wide analysis of convergent evolution in CAM plants, the K. fedtschenkoi genome data can be used to facilitate CAM research related to: (1) generating loss-of-function mutants for functional characterization of CAM-related genes using genome-editing technology; (2) deciphering the regulation of CAM genes through identification of transcription factors and promoters of their target genes; (3) analyzing CAM gene expression by serving as a template for mapping of RNA sequencing reads and protein mass spectrometry data; and (4) identifying DNA polymorphisms related to genetic diversity of plants in the genus Kalanchoë.

Our genome-wide comparison of CAM species and non-CAM species revealed two types of convergent changes that could be informative with respect to the evolution of CAM: protein sequence convergence and convergent changes in the diel re-scheduling of transcript abundance. In the present study, a total of 60 genes exhibited convergent evolution in divergent eudicot and monocot CAM lineages. Specifically, we identified protein sequence convergence in six genes involved in nocturnal CO₂ fixation, circadian rhythm, carbohydrate metabolism, and so on (Supplementary Table 10 and Supplementary Figs. 16–17). Also, we identified convergent diel expression changes in 54 genes that are involved in stomatal movement, heat stress response, carbohydrate metabolism, and so on (Supplementary Data 3). These results provide strong support for our hypothesis that convergent evolution in protein sequence or gene temporal expression underpins the multiple and independent emergences of CAM from C₃ photosynthesis. New systems biology tools and genome-editing technologies^52,61 offer great potential for plant functional genomics research based on loss- or gain-of-function mutants to characterize the role of the genes predicted here to have undergone convergent evolution.

Convergent gene function can arise by (1) a mutation or mutations in the same gene or genes that result in homoplasy in organisms or (2) independent causal mutation or mutations in different genes in each lineage^10,62. We identified four genes that showed convergent changes in protein sequences, none of which were shared by all three CAM species A. comosus, K. fedtschenkoi, and P. equestris (Supplementary Table 10 and Supplementary Figs. 12–15), suggesting that CAM convergences result mainly from the second scenario. Alternatively, K. fedtschenkoi shares the convergent mutation in the PEPC2 protein sequence with P. equestris (Fig. 6), whereas it shares the convergent change in the pattern of diel transcript abundance of PPCK1 with A. comosus (Fig. 4b). These results suggest that two alternative modes of convergent evolution could have occurred in pathways for nocturnal CO₂ fixation. First, PPCK expression shifted from the light period to the dark period to promote the activation of PEPC1 (the most abundant isoform), as exemplified by K. fedtschenkoi and A. comosus. Second, a single amino-acid mutation from R/K/H to D arose that maintains the active state of PEPC2 without the need for phosphorylation, as in K. fedtschenkoi and P. equestris.

According to the constrained selection theory of Morris¹¹, we expected to see convergent changes in protein sequences in all the three CAM species. However, in this study, single-site mutations were found in only two of the three CAM species. Our additional positive selection analysis revealed that Kalanchoë did share convergent sequence mutation with the other two CAM species, but at alternate sites (Supplementary Figs. 16–17). This is consistent with a recent report showing that single amino-acid mutations were not shared by all the bird species that displayed convergent evolution of hemoglobin function as an adaptation to high-altitude environments¹⁴. Alternatively, our results, to some extent, support the contingent adaptation theory of Gould¹⁵. The relevance of these predicted convergent changes to CAM needs to be investigated using experimental approaches, such as transferring the convergent CAM genes to C₃ photosynthesis species to test the effect of these genes on C₃-to-CAM photosynthesis transition.

In this study, we did not identify any gene that exhibited both convergent changes in transcript abundance patterns (Supplementary Data 3) and convergent changes in protein sequence (Supplementary Table 10 and Supplementary Figs. 16–17), suggesting that convergent evolution of a gene in CAM species was achieved through either protein sequence convergence or rewiring of gene expression. Indeed, we have not seen any reports showing that both protein convergence and convergent gene expression change occurred in the same gene in any type of organism to date. Thus, we can hypothesize that convergent evolution follows the “law of parsimony” that emphasizes the fewest possible assumptions for explaining a thing or event⁶³. An implication of this hypothesis is that reuse of the key genes via altered diel expression patterns would be the shortest path for C₃-to-CAM photosynthesis evolution, whereas mutations in some key sites of protein sequences, with the temporal gene expression pattern unchanged, would be the shortest path for evolving new protein function required by CAM. Although our data fit this hypothesis, additional screens for genes that have convergent changes in both protein sequence and expression pattern are merited.

Increasing human population and changes in global temperature and precipitation are creating major challenges for the sustainable supply of food, fiber, and fuel in the years to come. As a proven mechanism for increasing WUE in plants, CAM offers great potential for meeting these challenges. Engineering of CAM-into-C₃ photosynthesis plants could be a viable strategy to improve WUE in non-CAM crops for food and biomass production^4,6. The genes predicted here to have undergone convergent evolution during the emergence of CAM are crucial candidates for CAM-into-C₃ photosynthesis engineering. Our results suggest that CAM-into-C₃ photosynthesis engineering requires rewiring of the diel transcript abundance patterns for most of the candidate genes in the target C₃ photosynthesis species, along with amino-acid mutations in the protein sequences of several other candidate genes. Specifically, CAM-into-C₃ photosynthesis engineering efforts should be focused on changing the temporal patterns of transcript expression of endogenous genes in the target C₃ photosynthesis species corresponding to the K. fedtschenkoi genes listed in Supplementary Data 3. A CRISPR/Cas9-based knock-in approach⁵² can be used to replace the original endogenous promoters of the target genes with temporal promoters that confer temporal expression patterns similar to those of their orthologous genes in the CAM species. For example, dark-inducible promoters such as Din10⁶⁴ can be used to drive the expression of carboxylation gene modules during the nighttime, and light-inducible promoters such as GT1-GATA-NOS101⁶⁵ can be used to drive the expression of decarboxylation gene modules during the daytime. To make the protein sequence changes needed for CAM-into-C₃ photosynthesis engineering, transferring the K. fedtschenkoi genes listed in Supplementary Table 10 to target C₃ photosynthesis species via Agrobacterium-mediated transformation could provide a relatively straightforward path to an efficient engineered CAM pathway. Alternatively, one could mutate the amino acids shown in Supplementary Figs. 12–15 using a knock-in strategy with emerging genome-editing technology⁵².

In summary, this study provides an important model genome for studying plant comparative, functional, and evolutionary genomics, as well as significant advances in our understanding of CAM evolution. Our findings hold tremendous potential to accelerate the genetic improvement of crops for enhanced drought avoidance and sustainable production of food and bioenergy on marginal lands.

Methods

Plant material

Kalanchoë fedtschenkoi ‘M2’ plants were purchased from Mass Spectrum Botanicals (Tampa, FL, USA) (Supplementary Method 2).

Estimation of DNA content

The DNA contents of young leaf tissue samples were analyzed using the flow cytometry analysis service provided by Plant Cytometry Services (The Netherlands). The internal standard was Vinca minor (DNA = 1.51 pg/2C = 1477 Mbp/2C).
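The two values given for the internal standard are related by a mass-to-length conversion. The following minimal sketch assumes the commonly used factor of 1 pg of DNA ≈ 978 Mbp, which is not stated in the text, and assumes that sample DNA content is derived from the ratio of sample to standard fluorescence peak means, as is usual in flow cytometry. The function names and the peak values used to exercise them are hypothetical.

```python
# Assumed conversion factor (not given in the text): 1 pg DNA ~ 978 Mbp.
MBP_PER_PG = 978.0


def pg_to_mbp(picograms):
    """Convert a DNA mass in picograms to megabase pairs."""
    return picograms * MBP_PER_PG


def sample_2c_pg(standard_2c_pg, sample_peak, standard_peak):
    """Estimate sample 2C DNA content (pg) from fluorescence peak means."""
    if standard_peak <= 0:
        raise ValueError("standard_peak must be positive")
    return standard_2c_pg * sample_peak / standard_peak


# Check the internal standard quoted above: 1.51 pg/2C.
standard_mbp = pg_to_mbp(1.51)
print(round(standard_mbp))  # 1477
```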

Chromosome counting

Images were collected using an Olympus FluoView FV1000 confocal microscope (Center Valley, PA, USA) with a 60× objective. Images were sharpened using Adobe Photoshop and chromosomes were counted using ImageJ software (Supplementary Method 3).

Illumina sequencing of genome

The genomic DNA libraries of K. fedtschenkoi were sequenced on a MiSeq instrument (Illumina, CA, USA) using MiSeq Reagent Kit v3 (600-cycle) (Illumina, CA, USA) (Supplementary Method 4).

Transcriptome sequencing

In order to capture mRNA abundance changes responsive to diel conditions, samples were collected in triplicate from mature K. fedtschenkoi leaves (i.e., the fifth and sixth mature leaf pairs counting from the top) every 2 h over a 24 h time course under 12 h light/12 h dark photoperiod. Additional tissues were sampled in triplicate including roots, flowers, shoot tips plus young leaves, and stems at one time point, 4 h after the beginning of the light period (Supplementary Method 5).

Genome assembly and improvement

The K. fedtschenkoi genome was initially assembled using platanus⁶⁶ from 70X Illumina paired-end reads (2 × 300 bp reads; unamplified 540 bp whole-genome shotgun fragment library) and three mate-pair libraries (3 kb, 14X; 6 kb, 12X; 11 kb, 11X). Further genome scaffolding was performed using MeDuSa⁶⁷ sequentially with the genome assemblies of K. laxiflora v1.1 (Phytozome), Vitis vinifera Genoscope.12X (Phytozome), and Solanum tuberosum v3.4 (Phytozome).

Protein-coding gene annotation

The genome annotation for K. fedtschenkoi was performed using homology-based predictors facilitated with transcript assemblies (Supplementary Method 6).

Construction of orthologous groups

The protein sequences of 26 plant species were selected for ortholog group construction (Supplementary Method 7).

Construction of species phylogeny

The phylogeny of plant species was constructed from the protein sequences of 210 single-copy genes identified through analysis of orthologous groups (see “Construction of orthologous groups” section). The details for species phylogeny construction are described in Supplementary Method 8.

Construction of protein tribes and phylogenetic analysis

The protein sequences used for ortholog analysis (see “Construction of orthologous groups”) were also clustered into tribes using TRIBE-MCL⁶⁸, with a BLASTp E-value cutoff of 1e-5 and an inflation value of 5.0. Phylogenetic analysis of the protein tribes is described in Supplementary Method 9.

Analysis of convergence in protein sequences in CAM species

The phylogenetic trees of protein tribes (see aforementioned “Construction of protein tribes and phylogenetic analysis”) were examined to identify the “CAM-convergence” clade, which was defined to contain genes from K. fedtschenkoi (dicot) and at least one of the two monocot CAM species (A. comosus and P. equestris) without any genes from C₃ or C₄ species. The rationale for defining the “CAM-convergence” clade is that the dicot CAM species K. fedtschenkoi should be separated from the monocot CAM species if there is no convergence between Kalanchoë and the monocot CAM species (Supplementary Method 10).
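The clade definition above can be expressed as a simple test on the set of species found at the leaves of a clade. The sketch below is illustrative only and is not the authors' pipeline (see Supplementary Method 10): it assumes that the leaves of each clade have already been extracted from the tree and that a lookup of photosynthesis type per species is available. The function name, the lookup table, and the inclusion of Z. mays as an example C₄ species are assumptions made for illustration.

```python
MONOCOT_CAM = {"A. comosus", "P. equestris"}
DICOT_CAM = "K. fedtschenkoi"


def is_cam_convergence_clade(leaf_species, photosynthesis_type):
    """Apply the 'CAM-convergence' clade definition to a clade's leaf species."""
    species = set(leaf_species)
    has_kalanchoe = DICOT_CAM in species
    has_monocot_cam = bool(species & MONOCOT_CAM)
    # The clade must not contain genes from any C3 or C4 species.
    has_c3_or_c4 = any(photosynthesis_type[s] in ("C3", "C4") for s in species)
    return has_kalanchoe and has_monocot_cam and not has_c3_or_c4


# Illustrative lookup table (a small subset, for demonstration only).
types = {
    "K. fedtschenkoi": "CAM",
    "A. comosus": "CAM",
    "P. equestris": "CAM",
    "A. thaliana": "C3",
    "Z. mays": "C4",
}

print(is_cam_convergence_clade(["K. fedtschenkoi", "P. equestris"], types))  # True
print(is_cam_convergence_clade(["K. fedtschenkoi", "A. thaliana"], types))   # False
```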

Gene Ontology analysis and pathway annotation

Whole-genome gene ontology (GO) term annotation was performed using BLAST2GO^69,70 with a BLASTP E-value hit filter of 1 × 10⁻⁶, an annotation cutoff value of 55, and GO weight of 5. The enrichment of GO biological process and pathway annotation are described in Supplementary Method 11.

Analysis of carbohydrate active enzymes

The protein sequences were searched against the dbCAN database⁵⁵ using HMMER3 (http://hmmer.org/). The HMMER search outputs were parsed to keep significant hits with E-value <1e-23 (calculated by HMMER) and coverage >0.2 (calculated on the HMM, which is equal to (end position - start position)/total length of HMM), as suggested by a large scale benchmark analysis⁷¹.
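The two thresholds can be applied as a filter once the HMMER output has been parsed. The following minimal sketch assumes that each hit has already been reduced to a record holding the E-value, the start and end positions on the HMM, and the total HMM length. The record layout, field names, and example values are hypothetical and do not reproduce the parser used in this study.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    gene: str
    family: str
    evalue: float
    hmm_start: int
    hmm_end: int
    hmm_length: int


def hmm_coverage(hit):
    """Coverage on the HMM: (end position - start position) / total HMM length."""
    return (hit.hmm_end - hit.hmm_start) / hit.hmm_length


def keep_significant(hits, max_evalue=1e-23, min_coverage=0.2):
    """Keep hits with E-value below the cutoff and coverage above the cutoff."""
    return [
        h for h in hits
        if h.evalue < max_evalue and hmm_coverage(h) > min_coverage
    ]


# Made-up hits for illustration (gene names are placeholders).
hits = [
    Hit("geneA", "GT20", 1e-50, 10, 400, 500),  # passes both filters
    Hit("geneB", "GH35", 1e-10, 10, 400, 500),  # E-value too high
    Hit("geneC", "GT2", 1e-40, 10, 60, 500),    # coverage 0.1, too low
]
print([h.gene for h in keep_significant(hits)])  # ['geneA']
```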

Estimation of transcript abundance in Kalanchoë

The RNA-seq data in fastq format were mapped to the K. fedtschenkoi genome using TopHat2⁷². Transcript abundance in FPKM (Fragments Per Kilobase of transcript per Million mapped reads) was estimated using Cufflinks⁷³. Mapped read counts for the transcripts were obtained using htseq-count, a subprogram of HTSeq⁷⁴.
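For reference, the FPKM unit can be illustrated with its textbook definition. The sketch below is a simplified calculation only. Cufflinks estimates abundance with a statistical model, so this is not a reimplementation of that tool, and the function name and example numbers are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total fragment count must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Example: 500 fragments on a 2,000-bp transcript in a library of
# 20 million mapped fragments (made-up numbers).
print(fpkm(500, 2000, 20_000_000))  # 12.5
```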

Co-expression network analysis in Kalanchoë

The expression data of 16 samples in triplicates were used for co-expression network analysis, which included time-course data (12 time points: 2, 4, 6, …, 24 h after the beginning of the light period) from mature leaf and one time point data (4 h after the beginning of the light period) from roots, flowers, stems, and shoot tips plus young leaves collected in triplicate from the K. fedtschenkoi plants grown under 12 h light/12 h dark photoperiod. The details for co-expression network analysis are described in Supplementary Method 12.

Cluster analysis of gene expression in Kalanchoë

Count values for each RNA-seq library were used to calculate polynomial regressions across time (Supplementary Method 13).

Comparative analysis of gene expression

The diurnal expression data with 4-h intervals for Arabidopsis thaliana were obtained from Mockler et al.⁷⁵ and adjusted to 2-h interval time series by interpolation using the SRS1 cubic spline function (http://www.srs1software.com/). The diurnal expression data with 2-h intervals for K. fedtschenkoi were generated in this study. The diurnal expression data with 2-h intervals for Ananas comosus were obtained from Ming et al.¹⁸. The gene expression data were normalized by Z-score transformation. Hierarchical clustering of gene expression was performed for genes in each ortholog group using the Bioinformatics Toolbox in Matlab (Mathworks, Inc.) based on Spearman correlation (Supplementary Method 14).
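The normalization and the similarity measure named above can be illustrated in a few lines. The sketch below uses plain Python in place of the Matlab toolbox used in this study and covers only the Z-score transformation and the Spearman correlation between two profiles, not the interpolation or the hierarchical clustering. The use of the population standard deviation is an assumption, and the function names and example profiles are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Z-score transform (population standard deviation assumed)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up diel profiles sampled at the same time points.
gene_a = [1.0, 2.0, 4.0, 8.0, 4.0, 2.0]
gene_b = [10.0, 20.0, 40.0, 80.0, 40.0, 20.0]
rho = spearman(z_scores(gene_a), z_scores(gene_b))
```

Because Spearman correlation depends only on ranks, the Z-score transformation does not change its value. The transformation matters for steps that compare expression magnitudes directly.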

Genome synteny analysis

Pairwise genome alignments were performed between grape genome (Genoscope.12X; https://phytozome.jgi.doe.gov) and K. fedtschenkoi (Supplementary Method 15).

Protein 3D structural simulation

The protein structural models were built using the iterative threading assembly refinement (I-TASSER, V4.3) structural modeling toolkit^76,77.

Gas chromatography-mass spectrometry metabolite profiling

For the major metabolites of K. fedtschenkoi, a total of 36 leaf samples (the 5th and 6th fully expanded leaf pairs counting from the top) were collected, with three biological replicates sampled every 2 h over a 24-h diurnal cycle. Additionally, three biological replicate samples each of stems, roots, shoot tips plus young leaves, and flowers were collected (Supplementary Method 16).

In vitro protein expression and analysis of enzyme activity

The PEPC proteins were expressed in bacterial BL21 strains (Novagen BL21 (DE3) pLysS Singles) and purified via Glutathione Sepharose 4B beads (GE Healthcare Life Sciences, Pittsburgh, PA, USA). The protein quality was checked via western blot using anti-PEPC antibody (Agrisera, Sweden) and the PEPC activity was determined (Supplementary Method 17).

Data availability

The Department of Energy (DOE) will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). The K. fedtschenkoi genome sequence and annotation are deposited in Phytozome (https://phytozome.jgi.doe.gov). The K. fedtschenkoi genome sequence is also deposited at NCBI GenBank (https://www.ncbi.nlm.nih.gov/genbank/) under the accession code NQLW00000000. The genome sequencing reads are deposited in NCBI Sequence Read Archive (SRA) (https://www.ncbi.nlm.nih.gov/sra) with the BioSample accession SAMN07509503, which is the combination of the five individual BioSamples: SAMN07453935, SAMN07453936, SAMN07453937, SAMN07453938, and SAMN07453939. The RNA-Seq reads are deposited in NCBI SRA with the BioSample accession codes SAMN07453940 - SAMN07453987. The metabolite data is deposited at MetaboLights (http://www.ebi.ac.uk/metabolights/) under the accession code MTBLS519.

References

West-Eberhard, M., Smith, J. & Winter, K. Photosynthesis, reorganized. Science 332, 311–312 (2011).
Borland, A. M., Griffiths, H., Hartwell, J. & Smith, J. A. C. Exploiting the potential of plants with crassulacean acid metabolism for bioenergy production on marginal lands. J. Exp. Bot. 60, 2879–2896 (2009).
Cushman, J. C., Davis, S. C., Yang, X. & Borland, A. M. Development and use of bioenergy feedstocks for semi-arid and arid lands. J. Exp. Bot. 66, 4177–4193 (2015).
Yang, X. et al. A roadmap for research on crassulacean acid metabolism (CAM) to enhance sustainable food and bioenergy production in a hotter, drier world. New Phytol. 207, 491–504 (2015).
Owen, N. A. & Griffiths, H. A system dynamics model integrating physiology and biochemical regulation predicts extent of crassulacean acid metabolism (CAM) phases. New Phytol. 200, 1116–1131 (2013).
Borland, A. M. et al. Engineering crassulacean acid metabolism to improve water-use efficiency. Trends Plant Sci. 19, 327–338 (2014).
Silvera, K. et al. Evolution along the crassulacean acid metabolism continuum. Funct. Plant Biol. 37, 995–1010 (2010).
Christopher, J. & Holtum, J. Patterns of carbohydrate partitioning in the leaves of crassulacean acid metabolism species during deacidification. Plant Physiol. 112, 393–399 (1996).
Holtum, J. A. M., Smith, J. A. C. & Neuhaus, H. E. Intracellular transport and pathways of carbon flow in plants with crassulacean acid metabolism. Funct. Plant Biol. 32, 429–449 (2005).
Washburn, J. D., Bird, K. A., Conant, G. C., Pires, J. C. & Herendeen, P. S. Convergent evolution and the origin of complex phenotypes in the age of systems biology. Int. J. Plant Sci. 177, 305–318 (2016).
Morris S. C. Life’s Solution: Inevitable Humans In A Lonely Universe. Cambridge University (2003).
Foote, A. D. et al. Convergent evolution of the genomes of marine mammals. Nat. Genet. 47, 272–275 (2015).
Hu, Y. B. et al. Comparative genomics reveals convergent evolution between the bamboo-eating giant and red pandas. Proc. Natl Acad. Sci. USA 114, 1081–1086 (2017).
Natarajan, C. et al. Predictable convergence in hemoglobin function has unpredictable molecular underpinnings. Science 354, 336–339 (2016).
Gould S. J. Wonderful Life: The Burgess Shale And The Nature Of Life. Norton (1989).
Pfenning, A. R. et al. Convergent transcriptional specializations in the brains of humans and song-learning birds. Science 346, 1256846 (2014).
Magallón, S., Gómez-Acevedo, S., Sánchez-Reyes, L. L. & Hernández-Hernández, T. A metacalibrated time-tree documents the early rise of flowering plant phylogenetic diversity. New Phytol. 207, 437–453 (2015).
Ming, R. et al. The pineapple genome and the evolution of CAM photosynthesis. Nat. Genet. 47, 1435–1442 (2015).
Cai, J. et al. The genome sequence of the orchid Phalaenopsis equestris. Nat. Genet. 47, 65–72 (2015).
Hartwell, J., Dever, L. V. & Boxall, S. F. Emerging model systems for functional genomics analysis of crassulacean acid metabolism. Curr. Opin. Plant Biol. 31, 100–108 (2016).
Soltis, D. E., Soltis, P. S., Endress, P. K., & Chase, M. W. Phylogeny And Evolution Of Angiosperms. Sinauer Associates Inc. (2005).
Soltis, D. E. et al. Phylogenetic relationships and character evolution analysis of Saxifragales using a supermatrix approach. Am. J. Bot. 100, 916–929 (2013).
The Angiosperm Phylogeny Group. An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG IV. Bot. J. Linn. Soc. 181, 1–20 (2016).
Zeng, L. et al. Resolution of deep eudicot phylogeny and their temporal diversification using nuclear genes from transcriptomic and genomic datasets. New Phytol. 214, 1338–1354 (2017).
Moore, M. J., Soltis, P. S., Bell, C. D., Burleigh, J. G. & Soltis, D. E. Phylogenetic analysis of 83 plastid genes further resolves the early diversification of eudicots. Proc. Natl Acad. Sci. USA 107, 4623–4628 (2010).
Maddison, W. P. & Knowles, L. L. Inferring phylogeny despite incomplete lineage sorting. Syst. Biol. 55, 21–30 (2006).
Degnan, J. H. & Rosenberg, N. A. Gene tree discordance, phylogenetic inference and the multispecies coalescent. Trends Ecol. Evol. 24, 332–340 (2009).
Murat, F. et al. Karyotype and gene order evolution from reconstructed extinct ancestors highlight contrasts in genome plasticity of modern rosid crops. Genome Biol. Evol. 7, 735–749 (2015).
Jaillon, O. et al. The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature 449, 463–467 (2007).
Paterson, A. H. et al. Repeated polyploidization of Gossypium genomes and the evolution of spinnable cotton fibres. Nature 492, 423–427 (2012).
Amborella Genome Project. The Amborella genome and the evolution of flowering plants. Science 342, 1241089 (2013).
Ledger, S., Strayer, C., Ashton, F., Kay, S. A. & Putterill, J. Analysis of the function of two circadian-regulated CONSTANS-LIKE genes. Plant J. 26, 15–22 (2001).
Hsu, P. Y. & Harmer, S. L. Wheels within wheels: the plant circadian system. Trends Plant Sci. 19, 240–249 (2014).
Hartwell J. The Circadian Clock in CAM Plants. in Annual Plant Reviews: Endogenous Plant Rhythms (ed Hall A. J. W., McWatters H.). Blackwell Publishing (2006).
Hartwell, J., Nimmo, G., Wilkins, M., Jenkins, G. & Nimmo, H. Phosphoenolpyruvate carboxylase kinase is a novel protein kinase regulated at the level of gene expression. Plant J. 20, 333–342 (1999).
Taybi, T., Patil, S., Chollet, R. & Cushman, J. A minimal Ser/Thr protein kinase circadianly regulates phosphoenolpyruvate carboxylase activity in CAM-induced leaves of Mesembryanthemum crystallinum. Plant Physiol. 123, 1471–1482 (2000).
Zhang, L. et al. Origin and mechanism of crassulacean acid metabolism in orchids as implied by comparative transcriptomics and genomics of the carbon fixation pathway. Plant J. 86, 175–185 (2016).
Schlieper, D., Förster, K., Paulus, J. K. & Groth, G. Resolving the activation site of positive regulators in plant phosphoenolpyruvate carboxylase. Mol. Plant 7, 437–440 (2014).
Nimmo, H. G. The regulation of phosphoenolpyruvate carboxylase in CAM plants. Trends Plant Sci. 5, 75–80 (2000).
Dever, L. V., Boxall, S. F., Kneřová, J. & Hartwell, J. Transgenic perturbation of the decarboxylation phase of crassulacean acid metabolism alters physiology and metabolism but has only a small effect on growth. Plant Physiol. 167, 44–59 (2015).
Dittrich, P. Nicotinamide adenine dinucleotide-specific “malic” enzyme in Kalanchoë daigremontiana and other plants exhibiting crassulacean acid metabolism. Plant Physiol. 57, 310–314 (1976).
Kinoshita, T. et al. Phot1 and phot2 mediate blue light regulation of stomatal opening. Nature 414, 656–660 (2001).
Krause, G. H., Winter, K., Krause, B. & Virgo, A. Protection by light against heat stress in leaves of tropical crassulacean acid metabolism plants containing high acid levels. Funct. Plant Biol. 43, 1061–1069 (2016).
Berry, J. & Björkman, O. Photosynthetic response and adaptation to temperature in higher plants. Annu. Rev. Plant Physiol. 31, 491–543 (1980).
Salvucci, M. E. & Crafts-Brandner, S. J. Mechanism for deactivation of Rubisco under moderate heat stress. Physiol. Plant. 122, 513–519 (2004).
Wang, W., Vinocur, B., Shoseyov, O. & Altman, A. Role of plant heat-shock proteins and molecular chaperones in the abiotic stress response. Trends Plant Sci. 9, 244–252 (2004).
Wang, G. et al. A tomato chloroplast-targeted DnaJ protein protects Rubisco activity under heat stress. J. Exp. Bot. 66, 3027–3040 (2015).
Liu, C. et al. Coupled chaperone action in folding and assembly of hexadecameric Rubisco. Nature 463, 197–202 (2010).
Nijhawan, A., Jain, M., Tyagi, A. K. & Khurana, J. P. Genomic survey and gene expression analysis of the basic leucine zipper transcription factor family in rice. Plant Physiol. 146, 333–350 (2008).
Ram, H. & Chattopadhyay, S. Molecular interaction of bZIP domains of GBF1, HY5 and HYH in Arabidopsis seedling development. Plant Signal. Behav. 8, e22703 (2013).
Chen, X. et al. Shoot-to-root mobile transcription factor HY5 coordinates plant carbon and nitrogen acquisition. Curr. Biol. 26, 640–646 (2016).
Liu, D., Hu, R., Palla, K. J., Tuskan, G. A. & Yang, X. Advances and perspectives on the use of CRISPR/Cas9 systems in plant genomics research. Curr. Opin. Plant Biol. 30, 70–77 (2016).
Borland, A. M., Guo, H.-B., Yang, X. & Cushman, J. C. Orchestration of carbohydrate processing for crassulacean acid metabolism. Curr. Opin. Plant Biol. 31, 118–124 (2016).
Lombard, V., Golaconda Ramulu, H., Drula, E., Coutinho, P. M. & Henrissat, B. The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Res. 42, D490–D495 (2014).
Yin, Y. et al. dbCAN: a web resource for automated carbohydrate-active enzyme annotation. Nucleic Acids Res. 40, W445–W451 (2012).
Martins, M. C. M. et al. Feedback inhibition of starch degradation in Arabidopsis leaves mediated by trehalose 6-phosphate. Plant Physiol. 163, 1142–1163 (2013).
Xiang, L. et al. Exploring the neutral invertase–oxidative stress defence connection in Arabidopsis thaliana. J. Exp. Bot. 62, 3849–3862 (2011).
Sahm, A., Bens, M., Platzer, M. & Szafranski, K. PosiGene: automated and easy-to-use pipeline for genome-wide detection of positively selected genes. Nucleic Acids Res. 45, e100 (2017).
Ito, N. & Phillips, S. E. Novel thioether bond revealed by a 1.7 Å crystal structure of galactose oxidase. Nature 350, 87 (1991).
Black, C. C. & Osmond, C. B. Crassulacean acid metabolism photosynthesis: ‘working the night shift’. Photosynth. Res. 76, 329–341 (2003).
De Paoli, H. C., Tuskan, G. A. & Yang, X. H. An innovative platform for quick and flexible joining of assorted DNA fragments. Sci. Rep. 6, 19278 (2016).
Wake, D. B., Wake, M. H. & Specht, C. D. Homoplasy: from detecting pattern to determining process and mechanism of evolution. Science 331, 1032–1035 (2011).
Laird, J. The law of parsimony. The Monist 29, 321–344 (1919).
Fujiki, Y. et al. Dark‐inducible genes from Arabidopsis thaliana are associated with leaf senescence and repressed by sugars. Physiol. Plant. 111, 345–352 (2001).
Puente, P., Wei, N. & Deng, X. W. Combinatorial interplay of promoter elements constitutes the minimal determinants for light and developmental control of gene expression in Arabidopsis. EMBO J. 15, 3732 (1996).
Kajitani, R. et al. Efficient de novo assembly of highly heterozygous genomes from whole-genome shotgun short reads. Genome Res. 24, 1384–1395 (2014).
Bosi, E. et al. MeDuSa: a multi-draft based scaffolder. Bioinformatics 31, 2443–2451 (2015).
Enright, A. J., Van Dongen, S. & Ouzounis, C. A. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 30, 1575–1584 (2002).
Gotz, S. et al. High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic Acids Res. 36, 3420–3435 (2008).
Conesa, A. et al. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21, 3674–3676 (2005).
Ekstrom, A., Taujale, R., McGinn, N. & Yin, Y. PlantCAZyme: a database for plant carbohydrate-active enzymes. Database (Oxford) 2014, bau079 (2014).
Kim, D. et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, 1 (2013).
Trapnell, C. et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 7, 562–578 (2012).
Anders, S., Pyl, P. T. & Huber, W. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015).
Mockler, T. C. et al. The Diurnal project: diurnal and circadian expression profiling, model-based pattern matching, and promoter analysis. Cold Spring Harb. Symp. Quant. Biol. 72, 353–363 (2007).
Roy, A., Kucukural, A. & Zhang, Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat. Protoc. 5, 725–738 (2010).
Yang, J. et al. The I-TASSER Suite: protein structure and function prediction. Nat. Methods 12, 7–8 (2015).
Stamatakis, A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688–2690 (2006).
Mirarab, S. & Warnow, T. ASTRAL-II: coalescent-based species tree estimation with many hundreds of taxa and thousands of genes. Bioinformatics 31, i44–i52 (2015).

Acknowledgements

This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. This research was supported by the U.S. Department of Energy, Office of Science, Genomic Science Program under Award Number DE-SC0008834. Additional support was provided by the UK Biotechnology and Biological Sciences Research Council (grant no. BB/F009313/1) and the Laboratory Directed Research and Development (LDRD) Program (Project ID: 7758) of Oak Ridge National Laboratory. The work conducted by the U.S. Department of Energy Joint Genome Institute is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory. This research also used the Compute and Data Environment for Science (CADES) at the Oak Ridge National Laboratory. We thank Daniel Rokhsar, Mary Ann Cushman, and Lee Gunter for critical review and comments on the manuscript and Lori Kunder (Kunder Design Studio) for assistance with figure preparation. Oak Ridge National Laboratory is managed by UT-Battelle, LLC for the U.S. Department of Energy under Contract Number DE-AC05-00OR22725.

Author information

Authors and Affiliations

Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
Xiaohan Yang, Rongbin Hu, Hengfu Yin, Degao Liu, Deborah A. Weighill, Robert C. Moseley, Sara Jawdy, Zhihao Zhang, Meng Xie, Ritesh Mewalal, Kaitlin J. Palla, Henrique Cestari De Paoli, Anne M. Borland, Jin-Gui Chen, Wellington Muchero, Daniel A. Jacobson, Timothy J. Tschaplinski & Gerald A. Tuskan
The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee, Knoxville, TN, 37996, USA
Xiaohan Yang, Deborah A. Weighill, Robert C. Moseley, Kaitlin J. Palla & Daniel A. Jacobson
HudsonAlpha Institute for Biotechnology, 601 Genome Way, Huntsville, AL, 35801, USA
Jerry Jenkins, Jane Grimwood & Jeremy Schmutz
US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, 94598, USA
Shengqiang Shu, David M. Goodstein & Jeremy Schmutz
Center for Genomics and Biotechnology, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, Fujian, 350002, China
Haibao Tang & Ray Ming
Department of Biochemistry and Molecular Biology, University of Nevada, Reno, NV, 89557, USA
Won Cheol Yim, Jungmin Ha, Rebecca Albion, Travis Garcia, Jesse A. Mayer, Sung Don Lim & John C. Cushman
Department of Plant Biology, University of Georgia, Athens, GA, 30602, USA
Karolina Heyduk & James H. Leebens-Mack
Department of Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, TN, 37996, USA
Hao-Bo Guo & Hong Guo
Department of Biological Sciences, Northern Illinois University, DeKalb, IL, 60115, USA
Elisabeth Fitzek & Yanbin Yin
Department of Plant Sciences, Institute of Integrative Biology, University of Liverpool, Liverpool, L69 7ZB, UK
James Hartwell, Susanna F. Boxall & Louisa V. Dever
Chemical Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
Paul E. Abraham & Robert L. Hettich
Department of Plant Sciences, University of Oxford, Oxford, OX1 3RB, UK
Juan D. Beltrán & J. Andrew C. Smith
Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
Ching Man Wai & Ray Ming
Pacific Biosciences, Inc., 940 Hamilton Avenue, Menlo Park, CA, 94025, USA
Paul Peluso
Department of Horticulture, Michigan State University, East Lansing, MI, 48824, USA
Robert Van Buren
Department of Plant Sciences, University of Tennessee, Knoxville, TN, 37996, USA
Henrique Cestari De Paoli
School of Natural and Environmental Science, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK
Anne M. Borland
Smithsonian Tropical Research Institute, Apartado, Balboa, Ancón, 0843-03092, Republic of Panama
Klaus Winter

Contributions

X.Y. conceived and initiated the Kalanchoë genome project, supervised the study and interpreted the data, and wrote the manuscript; R.H. carried out protein function characterization, data analysis, and wrote the manuscript; H.Y., S.J., and P.P. carried out genome sequencing and RNA-seq; J.J. and J.S. carried out genome assembly; S.S. and D.M.G. carried out genome annotation; H.T. carried out genome duplication, synteny analysis, and wrote the manuscript; J.H., S.F.B., and L.V.D. contributed to material, interpreted the data, and wrote the manuscript; C-M. W., R.V.B., and R.M. contributed pineapple genomics and gene expression data; D.A.W., R.C.M., and D.A.J. carried out convergent expression analysis; P.E.A. and R.L.H. carried out GO and metabolite data analysis; K.W. interpreted the data; J.A.C.S. interpreted the data and wrote the manuscript; E.F., R.M., H.C.D.P., A.M.B., and Y.Y. carried out metabolic pathway analysis; Z.Z. and T.J.T. carried out metabolite profiling; J.D.B. carried out phylogenetic analysis; H.-B.G. and H.G. carried out phylogenetic analysis and protein structure modeling; K.H. and J.H.L-M. carried out phylogenetic analysis, protein structure clustering, and wrote the manuscript; J.M.H. and K.J.P. carried out ploidy analysis; M.X., J-G.C. and W.M. contributed to protein function characterization; W.C.Y. carried out RNA-seq data analysis and interpreted the data; D.L. carried out stomatal and circadian gene analysis; R.A., T.G., J.A.M., and S-D.L. contributed to transcriptome and genome sequencing; J.G. contributed to transcriptome sequencing; J.C.C. contributed to transcriptome, genome sequencing, and wrote the manuscript; G.A.T. conceived the study and interpreted the data. All authors read and commented on the manuscript.

Corresponding author

Correspondence to Xiaohan Yang.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplementary Information

Description of Additional Supplementary Files

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Data 4

Supplementary Data 5

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Yang, X., Hu, R., Yin, H. et al. The Kalanchoë genome provides insights into convergent evolution and building blocks of crassulacean acid metabolism. Nat Commun 8, 1899 (2017). https://doi.org/10.1038/s41467-017-01491-7

Received: 12 January 2017
Accepted: 21 September 2017
Published: 01 December 2017
DOI: https://doi.org/10.1038/s41467-017-01491-7
