The C4 photosynthetic pathway evolved to allow efficient CO2 capture by plants where effective carbon supply may be limiting as in hot or dry environments, explaining the high growth rates of C4 plants such as maize. Important crops such as wheat and rice are C3 plants resulting in efforts to engineer them to use the C4 pathway. Here we show the presence of a C4 photosynthetic pathway in the developing wheat grain that is absent in the leaves. Genes specific for C4 photosynthesis were identified in the wheat genome and found to be preferentially expressed in the photosynthetic pericarp tissue (cross- and tube-cell layers) of the wheat caryopsis. The chloroplasts exhibit dimorphism that corresponds to chloroplasts of mesophyll- and bundle sheath-cells in leaves of classical C4 plants. Breeding to optimize the relative contributions of C3 and C4 photosynthesis may adapt wheat to climate change, contributing to wheat food security.
One of the key biological innovations was the development of the ability of an organism to use light as the source of energy to generate chemical energy (ATP and NAD(P)H) for metabolic activities1, in the process commonly known as photosynthesis2. Six phyla of bacteria have the ability to photosynthesize3, five of them using anoxygenic photosynthesis with bacteriochlorophyll and only one, the cyanobacteria, having oxygenic photosynthesis with chlorophyll4. Endosymbiotic associations of cyanobacteria in eukaryotes resulted in their ability to photosynthesize through chloroplasts, in the process designated as “photosyntax” or “photosynthesis” in 1893 by Charles Reid Barnes5. Chemical energy generated from light energy is captured and used to synthesize organic compounds in higher plants in ‘dark reactions’6. Many different photosynthetic pathways have been reported in higher plants7; four types, viz., C3, C4, CAM (Crassulacean acid metabolism), and C3-C4 intermediates, are widely known, while C4-like (less advanced C4), C3-CAM, and C4-CAM intermediates have also been reported. These photosynthetic pathways, able to use CO2 as a carbon source, evolved in cyanobacteria around 3.5 billion years ago8. The key enzyme in C3 photosynthesis, ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO), was reported to have evolved around the same time as cyanobacteria9. The C4 pathway originated approximately 30 Mya (million years ago)10 and was first described 50 years ago11. The pathway provides enhanced radiation-, water- and nitrogen-use efficiency12, especially in sub-optimal environments10,13.
Three classical C4 photosynthesis subtypes, NADP-ME (NADP-dependent malic enzyme), NAD-ME (NAD-dependent malic enzyme) and PEPCK (phosphoenolpyruvate carboxykinase), have been defined based upon the decarboxylation reactions involved14. These photosynthetic pathways explain the high growth rates of C4 plants such as maize. Anatomical, biochemical, and molecular evidence has been commonly used to distinguish C4-(sub)types from C3-types15. Kranz anatomy with reactions compartmentalized in different cell types has been considered essential for C4 photosynthesis16 but spatial compartmentalization in a single cell has been demonstrated more recently17. The stem and petiole of C3 plants (tobacco and celery) were reported to accomplish NAD-ME type C4 photosynthesis in cells surrounding vascular bundles18. Photosynthesis in cereal grains is less well defined. Ear photosynthesis in wheat contributes from 10% to 44% of grain yield19. Grain photosynthesis accounts for 33-42% of this photosynthesis depending on the genotype and environment20.
Wheat is a major food crop critical to global food security. The current increase in wheat production of around 1% per year is not keeping pace with the rate of yield growth required to achieve the target of doubling crop production by 205021. The likely impact of climate change makes progress in advancing wheat productivity more urgent. Increasing total plant biomass through efficient carbon capture by photosynthesis is now more crucial in improving wheat productivity since advances in grain yield by improving harvest index have plateaued22. Plants with the C4 pathway are known to contribute 25% of total photosynthesis although they represent just 3% of species10. Converting C3 crops to C4 provides the possibility of improving yield by 30% through improved water- and nitrogen- use efficiency23. Engineering C3 food crops like wheat and rice to use the C4 pathway has long been explored to enhance global food security24. We now report an analysis of the transcriptome of genes associated with C4 photosynthesis in the developing wheat grain. Genes identified as transcripts were located in the genome and their sequences analysed to determine likely specificity. This allowed an evaluation of substantial new evidence for C4 photosynthesis in wheat grains.
Remarkably, transcriptome analysis and functional annotation of genes expressed in developing wheat grains revealed the presence and expression of all genes specific to NAD-ME type C4-photosynthesis. When added to earlier evidence dispersed in the literature, the present discoveries suggest the functioning of a form of C4- photosynthesis specifically in the developing wheat grain. The transcriptome of the developing caryopsis from 35 diverse wheat genotypes (31 and 32 genotypes respectively from 14 and 30 days-post-anthesis stage with 28 genotypes in common) was analyzed by RNA-Seq. Annotation of the differentially expressed genes in the wheat grain transcriptome between 14 and 30 dpa (days-post-anthesis) indicated the presence of NAD-ME type C4 photosynthesis during wheat grain development. This was an unexpected finding with wheat being a well-known C3 crop. Wheat genes involved in C4 photosynthesis, the number of copies expressed in developing wheat grains and their C4 specificity (based on cytological and evolutionary evidence) are listed in Table 1.
Phosphoenolpyruvate carboxylase (ppc) genes were localized in wheat on the long arms of chromosomes 3 and 5. The mean expression value (in RPKM) for ppc from chromosome 3 across 31 genotypes at 14 dpa was 36.2 (Fig. 1A, sum of three sub-genomes A, B, and D), while it was only 0.29 for leaves25 (Fig. 2A; mean of three growth stages, Z10, Z23, and Z71, with the expression values on the Y-axis representing the sum of the three sub-genomes), indicating a 125-fold up-regulation in the developing wheat caryopsis. Conversely, ppc from chromosome 5 was up-regulated in leaves (Fig. 2A). It is well known that C4 plants have less RuBisCO protein (reflecting transcript abundance) than C3 plants26. The mean rbcS gene expression value was 512.3 and 39166 for the wheat caryopsis at 14 dpa and leaves respectively, indicating a 76-fold down-regulation in the developing wheat caryopsis. Taken together, this shows an enormous difference, approximately 9500-fold, between the developing wheat caryopsis and leaves in the relative expression of the ppc and rbcS genes.
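The fold changes quoted above follow directly from the mean RPKM values. The short sketch below reproduces that arithmetic; the variable and function names are illustrative only and are not part of the original analysis.

```python
# Mean expression values (RPKM) as quoted in the text.
ppc_grain = 36.2      # ppc (chromosome 3), caryopsis at 14 dpa
ppc_leaf = 0.29       # ppc (chromosome 3), leaves, mean of Z10, Z23 and Z71
rbcs_grain = 512.3    # rbcS, caryopsis at 14 dpa
rbcs_leaf = 39166.0   # rbcS, leaves


def fold_change(higher: float, lower: float) -> float:
    """Ratio of two mean expression values (hypothetical helper)."""
    return higher / lower


ppc_up = fold_change(ppc_grain, ppc_leaf)        # roughly 125-fold up in grain
rbcs_down = fold_change(rbcs_leaf, rbcs_grain)   # roughly 76-fold down in grain

# Change in the ppc:rbcS expression ratio between grain and leaf.
combined = ppc_up * rbcs_down                    # roughly 9500-fold

print(f"ppc up-regulation in grain:    {ppc_up:.1f}-fold")
print(f"rbcS down-regulation in grain: {rbcs_down:.1f}-fold")
print(f"relative ppc/rbcS difference:  {combined:.0f}-fold")
```

The combined figure is simply the product of the two ratios, so it inherits any uncertainty in the four mean values.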
Aspartate aminotransferase (aat; also known as got) is the most up-regulated among six C4 pathway genes in the developing wheat caryopsis. This is also the most up-regulated gene in the leaf tissues between C3 and C4 plants26. Of six copies (in each sub-genome) of the aat gene in wheat, only two copies were the C4 type (cytoplasmic 3L – aat1 and mitochondrial 7L – aat2). RNA-Seq analysis indicated that these genes were differentially up-regulated at 14 dpa in the developing caryopsis (Fig. 1B) when compared with leaves (Fig. 2B)25.
Two copies of malate dehydrogenase (mdh) gene were localized on the long and short arm of chromosome 1 (cytoplasmic – mdh1) and chromosome 5 (mitochondrial – mdh2) respectively across the three sub-genomes. The gene copy from chromosome 1 was differently expressed (Figs 1C and 2C) compared to the one from chromosome 5 in both grain and leaf tissues25. The mitochondrial targeted mdh2 gene from chromosome 5 is likely to be involved in C4 photosynthesis.
Two copies of the NAD-dependent malic enzyme coding gene (me2), with one each targeted to chloroplast and mitochondria, were localized on chromosomes 1 and 2 respectively. The mitochondrial targeted gene copy (chromosome 2) supports C4 photosynthesis, converting malate into pyruvate with release of CO2 for further fixation through the C3 cycle15. The mitochondrial isoform was up-regulated in the developing wheat caryopsis (Fig. 1D) while the plastidic isoform was up-regulated in leaves (Fig. 2D)25.
Two copies of alanine transaminase (gpt) genes were localized to the short arm of chromosomes 2 and 5 of hexaploid wheat. This cytoplasmic enzyme converts pyruvate to alanine and vice-versa in bundle sheath and mesophyll cells respectively in a classical NAD-ME type C4 pathway14. Both genes were expressed in similar proportions in the developing wheat caryopsis at 14 dpa (Figs 1E and 2E), while the gene on chromosome 2 was more highly expressed in leaves25.
Pyruvate, orthophosphate dikinase (ppdk) gene was localized to the long arm of chromosome 1 in hexaploid wheat. All four gene copies (although a full length sequence was not available) were used to assess the RPKM expression levels in the developing wheat caryopsis at 14 dpa (Fig. 1F) and in leaf (Fig. 2F) tissues25. Earlier reports indicate the role of a dual promoter in regulating a single gene copy during light and dark in the chloroplast and cytoplasm respectively with the second promoter region in the first intron for cytoplasmic expression27. Aoyagi and co-workers showed the presence of PPDK and RubisCO in the green pericarp, but failed to envision the possibility of C4 photosynthesis due to the lack of Kranz anatomy in developing wheat grains28.
Six genes (excluding carbonic anhydrase) are involved in the NAD-ME type C4 pathway: phosphoenolpyruvate (PEP) carboxylase (ppc), aspartate aminotransferase (aat; also known as got), malate dehydrogenase (mdh), NAD-dependent malic enzyme (me2), alanine aminotransferase (gpt), and pyruvate, orthophosphate dikinase (ppdk)15. Grain specific expression of the genes of NAD-ME type C4 photosynthesis, viz., ppc, aat, mdh, me2, gpt, and ppdk, in all three (A, B, and D) sub-genomes (Fig. 1) indicates a possible evolutionary diversification point well before the speciation of the diploid progenitors in the Triticeae tribe. Endosperm and aleurone transcripts29 do not express all of these genes, suggesting that the C4 pathway is restricted to the wheat pericarp.
Varied expression pattern between wheat genotypes
The presence of all C4 specific genes in the genome suggests that natural selection may have already explored the options being considered by plant breeders30. The levels of expression for all six genes of the NAD-ME type C4 pathway at 14 dpa varied across 31 genotypes (Fig. 3), suggesting potential for genetic selection for this trait in wheat breeding.
C4 specificity of gene sequences
Four of the six genes involved in NAD-ME type C4 photosynthesis (aat, mdh, me2, and ppdk) had sub-cellular targeting that suggests C4-type specificity15. The other two genes (ppc and gpt) require sequence information to distinguish between the copies specific for C3- or C4- pathways. Analysis of gpt genes in wheat suggested both C3 and C4 forms were expressed at similar levels (Figs 1E and 2E) across photosynthetic and non-photosynthetic tissues. While the ppc gene copies clearly show different expression patterns between developing grains and leaves (Figs 1A and 2A), sequence differences are the only way to distinguish the C3- and C4- isoforms. Specific amino acid substitutions have been associated with C4 functionality13. Increased tolerance to feedback inhibition by malate involves G884 (Glycine) in C4-isoforms rather than R884 (Arginine) as found in C3-isoforms. The translated sequences of the ppc gene from chromosomes 3 (S885) and 5 (R891) of wheat cDNA (IWGSC – International Wheat Genome Sequencing Consortium, release-23 version) indicate that the gene copy from chromosome 5 is C3-type, while the chromosome 3 copy is non-C3 type. The gene sequences from wheat and related species31 were analyzed using the translated amino acid sequence of the ppc gene (IWGSC cDNA database release-23) from chromosome 3. Results indicated that most of the Triticeae tribe members have five copies of the ppc gene (Table 2), although in the hexaploid wheat cDNA database we found only two copies (3L and 5L). The IWGSC cDNA database (release-23)31 was used to perform tblastn analysis with the translated ppc gene sequence, confirming that gene sequence copies from chromosomes 3S, 6 and 7 are not in frame, suggesting the presence of insertions or deletions in these genes. However, one ppc gene copy from all Triticeae members had S885, indicating a non-C3 type, while the other four copies revealed a C3-type, R891 (the corresponding amino acid position), across all the Triticeae members studied (Table 2).
Since the amino acid at this position in the non-C3 copy is neither R nor G, we studied different species acting as diversification points in the evolution of these species in order to compare them with respect to known C4 types (Table 3). This gave an indication that from Bryophytes to Angiosperms, the C3 type amino acid position was invariably conserved with ‘R’ (Table 3), whereas the C4 type amino acid position was either S (Panicum and Triticeae tribe) or Q (Alloteropsis, Setaria) or G (Alloteropsis, Panicum, Zea, and Sorghum) or I (Amaranthus) depending on the species or taxonomic group (Table 3).
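The classification rule described above can be summarized as a small lookup. The sketch below is only an illustration of that rule as stated in the text (R as the C3-type residue; G, S, Q and I as residues observed in C4-type isoforms). The function and constant names are hypothetical, and any other residue is left unclassified rather than guessed.

```python
# Residues at the malate-sensitivity position of PPC, as described in the text.
C3_RESIDUE = "R"                      # conserved from Bryophytes to Angiosperms
C4_RESIDUES = {"G", "S", "Q", "I"}    # observed in known C4-type isoforms


def classify_ppc_residue(residue: str) -> str:
    """Classify a single-letter amino acid code at the diagnostic position."""
    code = residue.strip().upper()
    if code == C3_RESIDUE:
        return "C3-type"
    if code in C4_RESIDUES:
        return "non-C3 (C4-type residue)"
    return "unclassified"


# The two wheat copies discussed in the text.
wheat_copies = {"chromosome 3 (S885)": "S", "chromosome 5 (R891)": "R"}
for copy_name, residue in wheat_copies.items():
    print(copy_name, "->", classify_ppc_residue(residue))
```

A residue-based call of this kind is only one line of evidence and is interpreted in the text together with the expression data.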
Wheat is widely known as a classical C3 plant. Close examination of the literature shows many reports of components of the case for C4 photosynthesis in the grain especially in early studies. However, this evidence has been overlooked because of the knowledge of C3 photosynthesis in the leaves and a lack of understanding of the possibility of different pathways in different parts of the plant. Indeed many studies have attempted to explain away the evidence that did not fit with the knowledge that wheat was a C3 plant. This study has identified a complete set of C4 specific genes in wheat genome for the first time. This finding addresses the apparent anomaly of this subfamily (Pooideae) of the Poaceae being uniquely seen to lack C4 photosynthesis. We have also shown for the first time that all the required genes are expressed in the required compartmentalization, specifically in the pericarp, a tissue with an anatomy that is suitable for supporting a C4 pathway. The possibility of photosynthesis in the pericarp of wheat grains was predicted in the early 1960s32. Phosphoenolpyruvate carboxylase (PPC) from the wheat or barley pericarp tissues of developing grain was reported to be 50-100 times as active in carbon fixation as ribulose diphosphate carboxylase (RuBisCO)33. Based on the enzyme activity for malate dehydrogenase, malic enzyme, and pyruvate-orthophosphate dikinase in pericarp tissues of developing grain, Duffus and Rosie33 indicated the possibility of C4 photosynthesis. A little later, Wirth, et al.34 studied different reproductive parts from wheat and oat – glume, lemma, palea, and pericarp – along with leaves and reported that the pericarp tissues of developing grains seemed to “possess carbon metabolism different to that of the other tissues”. They also analyzed and reported the possibility of refixation of the CO2 released through respiration or photorespiration. 
Assimilation of 14CO2 to malate and 3-phosphoglyceric acid in wheat ears and flag leaf respectively, along with higher activities for enzymes of the C4 and C3 metabolic pathways in ears and flag leaf respectively, suggested the possibility of C4 photosynthesis in ears35. Carbon isotope discrimination (Δ) values have been used to distinguish C3- from C4-type plants36. Although wheat was considered a C3 plant, Δ values were used to study the plants’ water-use or transpiration efficiency37,38. Their results indicate a clear difference between flag leaf and grain Δ values in different wheat genotypes, although the difference is not as distinct as it is with classical C4 photosynthesis. This might be due to either an inefficient, less advanced C4 type photosynthesis or the fact that grain photosynthesis accounts for only 33-42% of ear photosynthesis20, with the remainder translocated from leaf or stem tissues with C3-type photosynthesis, thereby diminishing the difference in Δ values between flag leaf and grain to a marginal level. Similarly but in reverse, in the C4-type maize plant, husk leaves were reported to be C3-type and their Δ values were marginally higher than those of leaves39.
In spite of this evidence (enzyme activity, 14CO2 in malate, Δ values), earlier researchers failed to explore C4 photosynthesis in wheat grains due to the view that Kranz anatomy was required for C4 photosynthesis16,28. In 2001 and 2002, the occurrence of C4 photosynthesis without Kranz anatomy was reported in single cells and in the petioles of C3 plants respectively17,18.
In the late 1990s, there were reports of the C4 pathway being used selectively at different developmental stages in some plants (Salsola spp. and Haloxylon spp.), with cotyledons and leaves exhibiting C3 and C4 type photosynthesis respectively40,41. Similarly, there have been reports on the selective use of the C4 pathway in different environments, such as terrestrial or submerged situations42,43 or high or low CO244. Selective expression of the C3 pathway was reported in the husk leaves (hypsophylls) of the maize plant45, which is otherwise a C4 plant. Evidence of 4-carbon compounds specifically in wheat leaf bases46 agreed with a much later report of C4 pathways in C3 plants18. Altered C4 and C3 enzymatic activity has recently been reported in wheat ears under water stress47 but the significance of this was not clear given the C3 status of wheat. This evidence suggests the presence of a diversified range of regulatory patterns of the C4 pathway in plants with ontogeny and varied environmental cues. Operation of different pathways (C3 or C4) at different growth stages allows wheat to have a lifecycle that extends across seasons with varying environments (cool, wet during vegetative growth; hot, dry during grain filling).
Molecular and cytological evidence
Functional annotation and differential expression of C4-specific gene copies (Figs 1 and 2) for genes of NAD-ME type photosynthesis specifically in developing wheat grains adds evidence for the C4 pathway in wheat grains as suggested in early reports34,35. With multiple copies in a genome, species preferentially co-opt the same neo-functionalized gene lineage for C4 photosynthesis48 although these genes were present well before the evolution of the C4 pathway but with different anaplerotic functions49.
Reports indicate that cross- and tube-cells in the pericarp of the developing wheat grain are photosynthetic in nature and contribute to the grain weight50. Thorough re-examination of this report indicates the presence of numerous mitochondria and dimorphic chloroplasts, with stacked grana in cross-cells and reduced stacking in tube-cells, being structurally similar to classical C4 types51. The presence of numerous mitochondria specifically in bundle sheath cells of the NAD-ME type C4 pathway has also been reported52. These pieces of cytological evidence, in addition to our molecular evidence, suggest that NAD-ME type C4 photosynthesis operates in developing wheat grains (Fig. 4) with cross- and tube-cells paralleling mesophyll and bundle sheath cells in a classical C4 pathway. In contrast to classical C4 photosynthesis, which is associated with few or no starch granules in mesophyll cells53, the presence of starch granules in cross-cells (mesophyll like) was reported by Morrison50. This led us to question the possibility of NAD-ME type C4 photosynthesis in wheat grains. However, there is evidence for the presence of RuBisCO in both mesophyll and bundle sheath cells of young amaranth leaves54, suggesting a C3 cycle in both mesophyll and bundle sheath cells. In some Flaveria spp., reports of the presence of RuBisCO in both mesophyll and bundle sheath cells supporting both the C3 and C4 cycle simultaneously led to their classification as having C4-like type (less advanced) photosynthesis7. Occurrence of the C3 cycle (presence of RuBisCO) in both mesophyll and bundle sheath cells along with the C4 pathway in some species might be due to the fact that compartmentalization of RuBisCO is the final step in the evolution of C4 from C3 photosynthesis55.
These considerations lead us to propose the occurrence of C4-like type (less advanced) photosynthesis in developing wheat grains through the cross- and tube-cell layers of pericarp paralleling the mesophyll and bundle sheath cells of classical C4 photosynthesis (Fig. 4).
Taxonomical and evolutionary evidence
Around 41% of grasses are known to fix carbon through the C4 pathway56. Hence, an overview at evolutionary scale linking speciation events with C4 photosynthesis might shed light on the evolution of the C4-like type photosynthetic pathway in developing wheat grains. The Poaceae family is monophyletic and consists of 12 subfamilies with three at the basal level, followed by the BOP and PACMAD clades consisting of three and six subfamilies each with the Triticeae tribe included in the subfamily Pooideae (cool season grasses) of the BOP clade56. To date, no species from the Pooideae have been reported to be C4. The Aristidoideae (PACMAD) subfamily has been reported to have at least two independent evolutions of the C4 pathway56. In this study, knowledge of specific amino acids (G in C4 and R in C3) in the ppc gene product required for efficient carbon fixation by the C4 pathway13 was used to show that wheat and all related species (including Hordeum and Brachypodium) had five ppc copies in their genome with four of them having the amino acid R indicating their C3 nature while one copy (3L, as in hexaploid wheat) has an S – a non-C3 type in place of R except Brachypodium (Table 2).
The C3 specific amino acid position (R) was apparently conserved (Table 3) from Bryophytes (around 450 Mya) to Angiosperms. The amino acid position with C4 specificity appears to have evolved at least four times in the last 30 Mya (origin of C4) with either S (Panicum and Triticeae tribe) or Q (Alloteropsis, Setaria) or G (Alloteropsis, Panicum, Zea, and Sorghum) or I (Amaranthus). This suggests that various amino acid substitutions at that site might result in differing efficiency of carbon fixation through the C4 pathway by altering tolerance to feedback inhibition by malate13. Analysis of enzyme kinetics with each of the four C4-specific amino acids individually might help to rank their photosynthetic efficiency. However, the weakest form among the four will probably be much more efficient in carbon fixation than the C3 type (R). The presence in wheat of an amino acid specific to a known C4-type (S in Panicum laetum and P. miliaceum) is strong evidence when taken together with the grain specific pattern of expression of the C4 specific ppc gene (Fig. 2A). The tribe Brachypodieae (Brachypodium distachyon) has amino acids corresponding to the C3-type for all five copies, while members of the tribe Triticeae (Aegilops, Hordeum, Triticum) have one copy of the C4-type and four copies of the C3-type (Table 2). This fits with the evolutionary time line for C4 photosynthesis around 30 Mya10, with Brachypodieae evolving around 35 Mya57 having only C3-type genes (Brachypodium). Unfortunately, there are no diversification points between Brachypodium (35 Mya) and Hordeum (11.6 Mya)57 to establish the exact timing of C4 evolution in the Pooideae subfamily. Derived traits like those associated with C4 photosynthesis appear at later evolutionary stages and are expressed later in plant development58. This is consistent with the observations of C4 photosynthesis in wheat and its relatives specifically in the grain.
Based on this molecular, cytological, taxonomical and evolutionary evidence, we propose the occurrence of C4-like type photosynthesis specifically in developing wheat grains. Recognition of both C3 and C4-like type photosynthetic pathways in wheat provides a basis for interpretation of wheat performance as a crop adapted to maturation in hot dry environments, suggesting that the plant may rely more on C4 photosynthesis under conditions of water stress during the grain filling stage. Photosynthates from pericarp, glumes and awns are critical59 when other parts of the plant lose photosynthetic capacity due to terminal drought often experienced in the environments in which wheat evolved. This may be especially important in the development of wheat varieties to adapt to climate change60 and associated temperature extremes. The operation of C4 photosynthesis specifically in these tissues provides an adaptive advantage to the wheat plant while C3 photosynthesis is adequate during early vegetative growth under more temperate conditions. The potential for genetic manipulation to extend C4 photosynthesis throughout the wheat plant seems much more realistic given the existing expression of the entire pathway in the grain. This supports the view that plant species have evolved specific photosynthetic pathways in different organs, at specific developmental stages and in different environments suggesting that the classification of plants as C3 or C4 or CAM in a broad fashion cannot simply be based upon leaf anatomy. Research to establish the variation in flux through this pathway in wheat and its progenitors will shed much light on the share of carbon fixation through the C3 and C4 pathway under varying environmental conditions. This has the potential to suggest new options for the development of higher yielding wheat genotypes.
Thirty-five wheat genotypes viz., Amurskaja, Arnhem, Banks, Bativa, Beyrouth-3, Bobwihte-26, Bowerbird, Des-367, Dollarbird, Ellison, Garbo, Giza-139, Gregory, Huandoy, India-37, India-211, India-259, Iraq-46, JingHong-1, Kite, LermaRojo, Martonvasari-13T, Punjab-7, Qalbis, Saturno, Sunco, Sphaerococcum, Tunis-24, Greece-25, NW-25A, NW-51A, NW-93A, NW-108A, Pelada, and Vega were used for transcriptome analysis. Seeds for these genotypes were procured from the Australian Winter Cereals Collection. Seeds were germinated and plants grown under controlled conditions as described in Furtado, et al.58. Developing grains were collected from wheat spikes at 14 days- and at 30 days-postanthesis (dpa) as described elsewhere58.
RNA isolation, library preparation and NGS sequencing
RNA isolation, cDNA synthesis, library preparation and next generation sequencing were carried out as described by Furtado, et al.58. Libraries for 31 samples from 14 dpa and 32 samples from 30 dpa, with 28 genotypes in common, were prepared and sequenced. Libraries were not prepared for four cultivars, viz., NW-93A, NW-108A, Pelada, and Vega, at 14 dpa, and three cultivars, viz., Greece-25, NW-25A, and NW-51A, at the 30 dpa stage due to lack of sufficient starting material.
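The sample counts above can be checked with simple set arithmetic. The sketch below uses only the total number of genotypes and the cultivar names given in the text; the variable names are illustrative.

```python
TOTAL_GENOTYPES = 35

# Cultivars without a library at each stage, as listed in the text.
missing_at_14dpa = {"NW-93A", "NW-108A", "Pelada", "Vega"}
missing_at_30dpa = {"Greece-25", "NW-25A", "NW-51A"}

n_14dpa = TOTAL_GENOTYPES - len(missing_at_14dpa)
n_30dpa = TOTAL_GENOTYPES - len(missing_at_30dpa)
# A genotype is common to both stages only if it is missing from neither.
n_common = TOTAL_GENOTYPES - len(missing_at_14dpa | missing_at_30dpa)

print(n_14dpa, n_30dpa, n_common)  # 31 32 28
```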
Sequencing data processing and analysis
Sequencing data obtained was imported into CLC genomics workbench ver. 7.0.4 (CLC Bio-Qiagen, Denmark) and further processing and analysis were done within this environment unless otherwise stated. Quality checking, trimming, and RNA-Seq analysis were performed as described in Furtado, et al.58 using the TaGI (Triticum aestivum Gene Indices, The Computational Biology and Functional Genomics Laboratory, Dana Farber Cancer Institute and Harvard School of Public Health) cDNA database as reference, containing 221,925 sequences (release 12.0)61.
Differential transcript and statistical analyses
Transcripts that were differentially expressed between 14 and 30 dpa were analyzed using the RNA-Seq experimentation tool with default parameters. Statistically significant differentially expressed transcripts were identified using both Gaussian (mean based) and Empirical analysis of Differential Gene Expression (EDGE, count based) statistics facilitated through CLC workbench (CLC Bio-Qiagen, Denmark), with the false discovery rate (FDR) corrected p-value significance threshold set at the 0.01 level.
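To illustrate the idea of retaining only transcripts that pass an FDR-corrected threshold under two tests, a minimal sketch follows. It assumes a Benjamini-Hochberg correction, which the text does not specify; the function names and the example p-values are hypothetical, and this is not the CLC workbench implementation.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_in_both(p_gaussian, p_edge, alpha=0.01):
    """Indices of transcripts significant under both tests after correction."""
    q_gauss = benjamini_hochberg(p_gaussian)
    q_edge = benjamini_hochberg(p_edge)
    return [i for i, (a, b) in enumerate(zip(q_gauss, q_edge))
            if a <= alpha and b <= alpha]


# Hypothetical p-values for three transcripts under the two tests.
example_gaussian = [0.0001, 0.5, 0.0002]
example_edge = [0.0003, 0.0001, 0.9]
print(significant_in_both(example_gaussian, example_edge))
```

Requiring agreement between a mean-based and a count-based test is a conservative choice, since a transcript flagged by only one of them is discarded.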
Functional annotation and data mining
In total, 26,477 transcripts were significant at the FDR corrected value of 0.01 under both Gaussian and EDGE statistics. Among them, 319 and 181 transcripts were unique to 14 dpa and 30 dpa respectively, while 16,237 and 9,740 transcripts were differentially up-regulated at 14 dpa and 30 dpa respectively. Transcript sequences for these four groups (unique 14 dpa, unique 30 dpa, differential 14 dpa and differential 30 dpa) were extracted from the reference database (TaGI) and subjected to blastx analysis against the non-redundant protein database. Blast results obtained were converted to a BLAST2GO project file and exported in “.dat” format files using the plug-in version within CLC workbench (CLC Bio-Qiagen, Denmark). Functional annotation for these four groups was performed independently using BLAST2GO Pro ver 3.0.10 with default parameters62. Annotations were augmented using InterProScan followed by the Run-annex option. Annotations pertaining to the plant database were retained using the GO (gene ontology)-slim option. Finally, KEGG (Kyoto encyclopedia of genes and genomes) pathway maps for these four annotated sequence groups were retrieved using the GO-enzyme code mapping option. The differential 14 dpa group highlighted the presence of a complete C4 photosynthetic pathway in the developing wheat caryopsis.
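As a quick consistency check, the four groups described above partition the full set of significant transcripts. The dictionary keys below are illustrative labels, not names used in the original analysis.

```python
# Transcript counts for the four groups, as given in the text.
groups = {
    "unique_14dpa": 319,
    "unique_30dpa": 181,
    "differential_14dpa": 16237,
    "differential_30dpa": 9740,
}

total_significant = sum(groups.values())
print(total_significant)  # 26477

# Share of significant transcripts that are higher at, or unique to, 14 dpa.
share_14dpa = (groups["unique_14dpa"] + groups["differential_14dpa"]) / total_significant
print(f"{share_14dpa:.1%}")
```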
Chromosomal localization and IWGSC transcript retrieval
Based on enzyme code mapping, TaGI transcript IDs pertaining to the enzyme codes (EC numbers) for the genes involved in the C4 photosynthetic pathway from the differential 14 dpa group were retrieved using CLC workbench (CLC Bio-Qiagen, Denmark). A total of 62 transcripts for the six genes (phosphoenolpyruvate carboxylase – ppc; aspartate aminotransferase – aat; malate dehydrogenase – mdh; decarboxylating dehydrogenase – me2; alanine aminotransferase – gpt; and pyruvate, orthophosphate dikinase – ppdk) were retrieved. Blast searches for the 62 transcripts from the TaGI database were performed against the IWGSC cDNA database containing 100,717 sequences (release-23)31 for retrieval of IWGSC transcripts (since the TaGI transcripts are shorter and mostly incomplete).
Modified reference and RNA-Seq analysis
Based on blast analyses using the 62 transcripts of TaGI as reference, 55 transcripts from the IWGSC cDNA database (release-23)31 were obtained. Using sub-genome sequence information and sequence alignment, 10 of the 55 transcripts were found to represent five genes, each pair being two parts of the same transcript with or without overlap. Based on homology and sequence alignment between the sub-genome copies, those 10 transcripts were stitched into five transcripts, resulting in a total of 50 transcripts for the six genes that accomplish NAD-ME type C4 photosynthesis. In order to construct a modified reference, the 10 transcripts that were stitched were replaced with the five stitched transcripts in the 100,717 sequences of the IWGSC cDNA (release-23)31. The resulting database containing 100,712 sequences was named “modified IWGSC cDNA (release-23)” and used for performing RNA-Seq analysis as described above58 to obtain RPKM values for the 31 genotypes at the 14 dpa stage. Although researchers use FPKM instead of RPKM for paired-end reads, we used RPKM with the option of counting mapped paired-end reads as “two” and mapped singleton reads as “one” to avoid confusion between FPKM and RPKM terminologies.
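The bookkeeping for the modified reference and the read-counting convention can be sketched as follows. The RPKM function uses the standard definition (reads per kilobase of transcript per million mapped reads) with a mapped pair counted as two reads, as described above; the function name, argument names and example numbers are hypothetical, and the exact computation inside CLC workbench may differ in detail.

```python
# Reference bookkeeping: 10 partial transcripts are replaced by 5 stitched ones.
iwgsc_sequences = 100_717
c4_transcripts = 55
stitched_from, stitched_into = 10, 5

modified_reference_size = iwgsc_sequences - stitched_from + stitched_into
final_c4_transcripts = c4_transcripts - stitched_from + stitched_into
print(modified_reference_size, final_c4_transcripts)  # 100712 50


def rpkm(mapped_pairs: int, mapped_singletons: int,
         transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Standard RPKM, counting each mapped pair as two reads.

    total_mapped_reads is assumed to be counted with the same convention.
    """
    reads = 2 * mapped_pairs + mapped_singletons
    return reads * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example: 500 pairs on a 2 kb transcript, 10 million mapped reads.
print(rpkm(500, 0, 2000, 10_000_000))  # 50.0
```

Counting a pair as two reads keeps the values on the RPKM scale, so for fully paired data they are not halved relative to single-end counting in the way fragment-based FPKM values would be.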
RNA-Seq analysis for tissue specific transcriptome data
Raw reads (100 bp paired-end sequencing on Illumina HiSeq2000) of different tissues (leaf, and grain) at three different growth stages for hexaploid wheat (‘Chinese Spring’) were available online25. These raw sequence reads were downloaded, and processed through the CLC workbench (CLC Bio-Qiagen, Denmark). Quality checking, trimming and RNA-Seq analysis using the modified IWGSC cDNA (release-23) containing 100,712 sequences as reference were performed to obtain RPKM values and represented in pictorial form.
Taxonomical and evolutionary relation for C4-specificity
Specific amino-acid positions for PPC (PEPCase) that are functionally related to C3 and C4-specificity were reported recently13. In order to identify these in wheat and related species (Table 2), whole genome sequence details63 were downloaded and translational blast analysis was performed using CLC workbench ver. 8.5.1 (CLC Bio-Qiagen, Denmark).
Similar analyses were performed for species for which a genome sequence was available, including taxa from bryophytes to angiosperms31,64,65 corresponding to various diversification points in an evolutionary timeline (Table 3). Although whole genome data for some well-known C4 species were not available, ppc gene sequences in public databases were used to study the evolutionary pattern at specific amino acid positions (Table 3) that are functionally related to C3 or C4 specificity.
How to cite this article: Rangan, P. et al. New evidence for grain specific C4 photosynthesis in wheat. Sci. Rep. 6, 31721; doi: 10.1038/srep31721 (2016).
We thank Grain Foods CRC for supporting the generation of the transcriptome data. PR is supported by an Indo-Australian Career Boosting Gold Fellowship from the Department of Biotechnology, Government of India.