Plant roots are the first organs sensing and responding to salinity stress, manifested differentially between different root types, and also at the individual tissue and cellular level. High genetic diversity and the current lack of an assembled map-based sequence of the barley genome severely limit barley research potential. We used over 580 and 600 million paired-end reads, respectively, to create two de novo assemblies of a barley landrace (Sahara) and a malting cultivar (Clipper) with known contrasting responses to salinity. Generalized linear models were used to statistically access spatial, treatment-related, and genotype-specific responses. This revealed a spatial gene expression gradient along the barley root, with more differentially expressed transcripts detected between different root zones than between treatments. The root transcriptome also showed a gradual transition from transcripts related to sugar-mediated signaling at the root meristematic zone to those involved in cell wall metabolism in the elongation zone, and defense response-related pathways toward the maturation zone, with significant differences between the two genotypes. The availability of these additional transcriptome reference sets will serve as a valuable resource to the cereal research community, and may identify valuable traits to assist in breeding programmes.
Barley (Hordeum vulgare L.) is an essential food, feed and brewing crop, and a model system for temperate cereals. As a glycophyte, barley suffers substantial yield loss when grown under saline conditions1. Plant roots are the first organs sensing and responding to environmental stresses, including salinity stress, and have key functions in water and nutrient uptake and rhizosphere dynamics as well as anchoring the plant2. These responses can be manifested differentially between different root types, and also at the individual tissue and cellular level, as the longitudinal structure of plant roots contains partially overlapping specialized zones of development: The plant root tip includes the root cap and the apical meristem, where cell division and elongation originate and proceed along a developmental gradient toward the mature root3,4. The elongation zone is where newly generated cells increase in length, and can be further sub-divided into the distal elongation zone, a transition zone between meristematic and elongation zone. The distal elongation zone is bordered by the elongation zone, where cells cease division but elongate maximally, and which is adjacent to the maturation zone, where cells can further differentiate into specialized cell types, such as root hairs.
Genome-wide expression profiles of mRNA under both control and stress conditions have revealed striking cell-type and tissue-specific responses in plant roots5,6,7. These studies show that the analysis of spatially (and temporally) resolved transcriptional signatures along longitudinal root sections can be used to infer root developmental processes, and to predict previously unknown cellular functions through co-expression with genes of known function. This strongly suggests that analyses of whole plant tissues can dilute out information important to understand the complex molecular programmes that define root development6 and responses to stress8.
Relatively little is understood of barley roots during early developmental stages and under exposure to salinity stress. This is due in part to the lack of comprehensive barley root sequence datasets that limits the scope of investigations into the molecular and genetic basis of root traits. The assembly of genome sequences for barley cultivars “Morex”, “Bowman”, and “Barke”, was completed recently. However, the extensive genetic diversity (estimated 370,796 accessions of 31 Hordeum species9) and the current lack of an assembled map-based barley reference genome sequence still limit research potential. Given the substantial divergence among cultivars and ecotypes, de novo transcriptome assemblies should not be limited to species without published reference genomes: recent studies have shown that de novo transcriptome assemblies of plants with sequenced genomes can improve the annotation of diverse cultivars and identify cultivar-specific genes10. The comparison of polymorphisms against a single reference genotype does not necessarily provide a complete representation of the genetic diversity of a species, and may underestimate the variability among different genotypes. To discover previously unrecognized transcripts not part of the reference genome as well as to capture potentially novel transcript diversities between the different barley genotypes we have performed two separate de novo assemblies, one for the barley malting cultivar (cv.) Clipper (Australia), and one for the landrace (LR) Sahara (North Africa).
Clipper and Sahara are of particular interest due to their contrasting salt tolerance11,12,13. Clipper contains the Na+ exclusion locus HvNax4 which was shown to reduce shoot Na+ content by 12–59% (g−1 dry weight)14. Shelden et al.15 reported altered root growth phenotypes between Clipper and Sahara in response to the early phase of salinity, with Clipper able to maintain a significantly higher relative root elongation rate (RRER) than Sahara.
The key aims of this study were: (a) to provide new comprehensive resources for the identification of novel potential candidate genes for salinity tolerance in barley; and (b) to quantify the changes in transcript expression under varying conditions in functionally different zones of the root. We achieved this by constructing two independent de novo transcriptome assemblies of three key zones of barley roots (meristematic zone, elongation zone, and maturation zone) of an Australian malting cultivar and a North African landrace, before and after salinity stress, based on next generation RNA sequencing (RNA-Seq) to uncover the expressed gene complement. Comparative transcriptome profiling provided insights into the molecular and physiological functions in barley roots, and spatially resolved transcriptional information to reveal both unique genotype-specific as well as treatment-specific features. These reference transcriptomes provide resources for further molecular investigations of the Hordeum genus to reveal novel processes of root growth and development under salinity stress, and to identify candidate genes that will inform future crop improvement programs.
High quality de novo transcriptome assemblies of root longitudinal sections obtained from two barley genotypes
Paired-end RNA-Seq libraries were constructed for three root sections of the meristematic zone (including root tip; Z1), elongation zone (Z2), and maturation zone (Z3) of the roots of barley cv. Clipper, and LR Sahara, grown under control and salt-treated (100 mM NaCl) conditions in quadruplicate. Each sample, regardless of genotype, treatment or biological replicate, was represented by at least 32 million reads, after trimming and confirmation of a tag density sufficient for quantitative analysis of gene expression. The sequence reads were assembled using the Trinity assembler16,17, and the de novo transcriptome assembly summary statistics are summarised in Table 1.
We aligned quality-trimmed reads of each sample independently back to the transcriptome to identify per-sample variability. For Clipper samples, a mean of 10.2% of reads did not align to the Clipper assembled transcriptome (SD 0.4), 40.9% of reads aligned in a single concordant location (SD 1.9), and 48.9% (SD 1.7) concordantly aligned to more than one location in the transcriptome. For Sahara, 9.2% of reads did not align to the Sahara assembled transcriptome (SD 0.4), a per-sample average of 38.9% aligned concordantly exactly once (SD 1.6), whilst 51.9% of paired-reads aligned to more than one contig in the Sahara transcriptome (SD 1.3).
Validation of genotype assemblies
To validate each assembly, we pursued four separate approaches: (i) validation of core eukaryotic genes, (ii) reciprocal cross-validation of each genotype, (iii) comparative genomics analysis using available public plant genome assemblies from Phytozome (http://www.phytozome.net, version 10.2) and (iv) validation of the presence of key salt-responsive genes.
Presence of widely conserved eukaryotic genes
We used the Core Eukaryotic Genes Mapping Approach (CEGMA) program v2.4.01031218 to detect the presence of 248 widely conserved eukaryotic core genes (CEG) that are considered to have low frequencies of gene family expansion. For the Clipper assembly, we found that 242 of 248 CEGMA genes are complete in our gene list (97.6%), and an additional 6 genes are represented as fragments, with a total representation of 100% of the eukaryotic core gene set (Supplemental Table S1). For the Sahara assembly, we found that 240 of 248 CEGMA genes are complete (96.8%) in our gene list, and an additional 5 genes are represented as fragments, with a total representation of 98.8% of the eukaryotic core gene set (Supplemental Table S2). In comparison, 95% and 95.6% of the complete CEGs were previously identified in tobacco (Nicotiana benthamiana)19 and alfalfa (Medicago sativa L.) transcriptomes20, respectively. Using completeness and presence of CEGs as metrics of transcriptome assembly success, this suggests we have generated two high-quality transcriptome assemblies of Clipper and Sahara.
Reciprocal cross-validation shows high sequence similarity between barley genotypes
The mean expressed contig number was between 54,000 and 65,000 at the transcript level (between 33,000 and 41,000 at the gene level) in Clipper and Sahara, respectively, with the maturation zone (Z3) having a slightly higher number of expressed contigs in both genotypes compared to the other two zones (Fig. 1A).
To assess sequence similarity between genotypes, we performed a bi-directional sequence similarity analysis (see Methods) using Blast-like alignment tool (BLAT21). As shown in Fig. 1B, over 87% of Clipper contigs have similarity to Sahara with at least 90% identity. For Sahara contigs, over 89% of contigs had the highest scoring similarity to a Clipper contig with at least 90% identity.
Comparative genomics with existing genome-based resources
To characterize and approximate the coverage of sequenced and assembled transcripts representing common gene loci to further support transcriptome completeness of the de novo assembly, we performed pairwise BLASTP of Clipper and Sahara predicted proteomes to thirteen angiosperms using NCBI BLAST+22. Transcriptomes of Clipper and Sahara show widespread orthologous groups to the Munich Information Center for Protein Sequences (MIPS) barley cultivar assemblies (Fig. 2A) using OrthoFinder23. A total of 317,262 orthologous groups (OGs) involved one or more of the five barley genotypes. Of these, Clipper contigs were part of 72,873 OGs and Sahara contigs participated in 77,286 OGs. A total of 80,039 Clipper contigs, about 58% of contigs in the transcriptome, were part of an OG, with a similar number found in Sahara (79,885 Sahara contigs were part of an OG, about 61%). 13,080 OGs contained predicted proteins from all five Hordeum genotypes.
In addition, we performed sequence comparisons to identify relationships between Clipper and Sahara to dicot (Arabidopsis thaliana) and monocot model plants (Oryza sativa) (Fig. 2B) as well as other plant species with key evolutionary relationships obtained from Phytozome24. This included the most ancient angiosperm Amborella trichopoda, the emerging model plant Aquilegia coerulea, the tree species models Eucalyptus grandis and Populus trichocarpa, the legume model Medicago truncatula, and as well the Poaceae species Zea mays, Sorghum bicolor, and Brachypodium distachyon. The highest number of hits was detected for the Clipper and Sahara genotypes against Hordeum vulgare cvs Barke, Bowman, and Morex genomes (Table 2). The quality of the annotation was supported by the observation that 70–95.8% of the ab initio predicted protein sequences of Clipper and Sahara had at least one ortholog in any of the three Poaceae genomes present in Phytozome (Zea mays, Sorghum bicolor, or Brachypodium distachyon) (Table 2).
Validation of the presence of key salt-responsive genes
To further support the transcriptome completeness of the de novo assembly, we identified four putative Oryza sativa High Affinity Potassium Transporter (OsHKT1.5, OsHKT2.1, OsHKT2.3, and OsHKT2.4) orthologs sharing 62.35–72.09%, and four putative Arabidopsis thaliana Salt Overly Sensitive (AtSOS1, AtSOS2, AtSOS3, and AtSOS4) orthologs sharing 60.87–78.25% sequence identity in the cv. Clipper and LR Sahara assemblies that were differentially expressed along the root zones (Supplemental Figs S1 and S2).
In rice, the stelar Na+ transporter encoding gene OsHKT1.5 was found to be expressed at higher levels after salt treatment in salt-tolerant rice lines25. Here, both OsHKT1.5 and OsHKT2.4 orthologs found in cv. Clipper and LR Sahara were only expressed after salinity stress (mature zones), and were not detectable under control conditions (Supplemental Fig. S1). OsHKT2.3 detected in Clipper showed increased expression in the elongation zones both before and after salinity stress, and in the maturation zone exclusively after salinity stress, whereas OsHKT2.1 showed higher transcript levels before salinity stress particularly in the elongation and maturation zones, consistent with both microarray and quantitative PCR results reported in ref. 25.
In Arabidopsis, the SOS2 gene is required for intracellular Na+ and K+ homeostasis, and although expression is present in the root without treatment, SOS2 expression is up-regulated by salt stress26. AtSOS2 orthologs (with similar trends for AtSOS4) detected in both cv. Clipper and LR Sahara were found at increased levels in the elongation zone under both conditions, and at increased levels after salinity in the maturation zone (Supplemental Fig. S2). By contrast, AtSOS1 and AtSOS3 orthologs were expressed more evenly across the root zones. Sahara expressed all four detected AtSOS orthologs at higher levels compared to Clipper.
Gene ontology analysis of barley transcripts
The assembled barley transcriptomes were annotated using BLASTP to UniProtKB/SwissProt, and BLASTX to UniProtKB/TrEMBL entries pre-loaded into Trinotate (https://trinotate.github.io/). For both genotypes, we were able to annotate two thirds of the assembled transcriptome: BLASTP yielded 76,269 (cv. Clipper) and 76,276 (LR Sahara) hits, and BLASTX resulted in 71,936 (cv. Clipper) and 72,451 (LR Sahara) hits, with 88,831 (cv. Clipper) and 88,017 (LR Sahara) hits combined. To investigate the biological role of genes regulated by salinity stress in both Clipper and Sahara, we tested for enriched GO (Gene Ontology) terms for both transcriptomes for differentially expressed (DE) contigs (log2-fold change > 1, FDR < 0.05), which provide a detailed annotation of gene function, biological process and cellular location of the gene product (Table 3; Supplemental Figs S3 and S4, cv. Clipper and LR Sahara, respectively).
GO analysis indicated that a total of 24,670 Clipper contigs were associated with biological process GO annotations, of which about one-third (8,919) were differentially expressed (DE) in one or more comparisons (see Methods; Supplemental Fig. S3). For Sahara, 24,825 contigs had biological process GO annotations, of which 9,268 contigs were considered DE in one or more comparison (Supplemental Fig. S4). The molecular function and cellular component GO annotations contained a similar number of GO terms as well as contigs associated with these processes which were significantly differentially expressed. GO analysis indicated that a total of 9,286 and 8,600 Clipper contigs were associated with molecular function or cellular component GO annotations and were DE in one or more comparisons (see Methods; Supplemental Fig. S3). For Sahara, 9,794 and 8,931 contigs had molecular function or cellular component GO annotations, respectively, and were considered DE in one or more comparisons (Supplemental Fig. S4).
Enriched gene ontology term annotations were identified using AgriGO (http://bioinfo.cau.edu.cn/agriGO/) enrichment analysis supplemented by REVIGO (http://revigo.irb.hr/) visualization toolbox contigs after salinity stress for both Clipper (Fig. 3A) and Sahara (Fig. 3B). Differentially expressed contigs with biological process GO annotations were used as the query list as they were the best represented and the complete set of contigs with GO annotations (biological process, molecular function, and cellular component) was used as the background.
Statistically enriched terms gave insights into biological pathways that were likely to be highly active by comparing them to the frequency at which those GO terms appear in the transcriptome. Metabolic process, cellular process, biosynthetic process, primary metabolic process, and translation were the five most significantly enriched GO terms in Clipper after salinity stress; 54 GO terms were reported as statistically significant (p < 0.05) for enrichment in the Plant Slim GO analysis with AgriGO when comparing control versus salinity (Supplemental Table S3). In Sahara, a total of 55 biological processes were reported as statistically significant enriched biological processes, with response to stimulus, response to stress, cellular process, metabolic process, and translation the five most significant GO terms (Supplemental Table S4).
Visualization of spatial gene expression profiles in barley roots
To highlight genes that have changed significantly in abundance along the different zones (Z1, Z2, Z3) of the root in both barley genotypes before and after salinity stress, we used a number of generalized linear models (GLM) to test for multiple interactions (treatment x zones) for both genotypes separately. Table 4 summarizes the GLM analysis, and Table 5 lists the top-ranked differentially expressed contigs after salinity stress per root zone with putative annotation. The full GLM data set is provided in Supplemental Data Sets S1 and S1, and DE gene annotation for cv. Clipper and LR Sahara, respectively, are provided in Supplemental Data Sets S3 and S4.
Surprisingly, only relatively few DE genes were detected as a response to treatment (100 mM NaCl) within individual root sections: In the meristematic zone (Z1), only 94 contigs (cv. Clipper) and 1,482 contigs (LR Sahara), respectively, were differentially expressed after salinity stress, with 8 and 24 times as many contigs up- than down-regulated in Clipper and Sahara, respectively. Examples include Dehydrin 5 (+5.3 log2 fold, cv. Clipper), a dehydration-inducible protein, and the sugar transporter SWEET (+8 log2 fold, LR Sahara). By contrast, in the elongation zone (Z2), 838 DE contigs (cv. Clipper) and 623 contigs (LR Sahara) were differentially expressed, with three times more genes up-than down-regulated in Sahara. Laccase, strongly up-regulated in the root tip in Clipper after salinity stress (+2.8 log2 fold), was strongly down-regulated in the elongation zone by −6 (log2)-fold. In addition, BBM2, a transcription factor that promotes cell proliferation, differentiation and morphogenesis27) (+5.3 log2 fold, cv. Clipper), and Pathogenesis-related Protein (+6.9 log2 fold, LR Sahara) were found to be strongly up-regulated after salinity stress in the elongation zone.
The largest treatment-specific transcriptional responses were observed in the maturation zone (Z3), with many more genes showing a significantly lower expression in this root zone after salinity treatment in both genotypes (cv. Clipper: DE contigs: 220 up and 1,340 down; LR Sahara: DE contigs: 676 up and 1,696 down). In both Clipper and Sahara, Cellulose synthase-like CslF9 transcripts were present at significantly lower levels in the maturation root zone after salinity stress (−6.5 and −5.7 log2 fold, respectively). Strikingly, contigs putatively annotated as peroxidases were expressed at much higher levels after salinity in the maturation root zone of LR Sahara (+5 log2 fold) compared to cv. Clipper (−6.9 log2 fold).
The largest transcriptional differences in terms of magnitude and quantity were observed between the different root zones: More than 18,000 contigs were differentially expressed in both genotypes in the elongation zone (Z2) and maturation zone (Z3) compared to the meristematic zone (Z1), with similar numbers for both control and salinity conditions (Table 4). To investigate these spatial transcriptional differences in more detail, we identified differentially expressed contigs for each genotype with GO annotation (biological process), and putatively annotated them using Trinotate (Fig. 4). We detected a total of 712 unique GO terms for transcripts from cv. Clipper, 157 GO terms for LR Sahara, and detected a further 1,303 GO terms shared by DE transcripts from both genotypes. GO term clusters consist of contigs found to be differentially expressed in any statistical comparison (treatments, genotypes, and root zones; for details see Methods), and are reported by the number of contigs that are differentially expressed >2-fold and are annotated with the given GO term. As the total number of computed clusters exceeded 3,400 for both genotypes, only the most highly expressed contigs of clusters that contained the largest number of contigs are presented in Fig. 4. Clusters are further grouped based on number of contigs and highest mean Trimmed mean of M-values (TMM)-normalized FPKM values present in each of the zones: meristematic zone (Z1), elongation zone (Z2), or maturation zone (Z3), respectively (Fig. 4A, see also Methods). GO term-enriched clusters that were exclusively found in cv. Clipper (Fig. 4B), LR Sahara (Fig. 4C), and in both genotypes (4D) are visualized separately.
As expected, expression profiles differ between root zones reflecting their biological function, and also show differences between the two barley genotypes. In cv. Clipper, DE transcripts of the meristematic zone (Z1) are enriched in GO terms related to the sugar signaling pathway, whereas DE transcripts of both the elongation (Z2) and maturation zones (Z3) in cv. Clipper are enriched in GO terms related to cell wall organization and regulation of circadian rhythm, respectively (Fig. 4B). GO terms related to cell wall organization contained 198 transcripts that were putatively identified as cellulose synthase-like (CslF) and cellulose synthase catalytic subunit (CesA9) genes, which are part of the cellulose synthase superfamily involved in the biosynthesis of mixed linkage β-(1,3; 1,4)-glucans and cellulose (a β-(1,4)-glucan), respectively28, and highly expressed in the elongation zone of cv. Clipper both before and after salinity stress. In both the maturation and elongation zone, several peroxidases related to regulation of circadian rhythm show spatial differential expression, with several also showing expression increases in response to salt treatment.
In LR Sahara, the cellular metabolic process GO term cluster includes 33 transcripts with many of them showing the highest expression in the root tip, including transcripts encoding proteins involved in cell wall polysaccharide metabolism (UDP-D-apiose/UDP-D-xylose synthase, UDP-glucuronic acid decarboxylase) (Fig. 4C). By contrast, lignin (Cinnamoyl-CoA reductase) and flavonoid (anthocyanidin reductase) biosynthesis transcript abundance is very low in the root tip, and increase strongly towards the elongation and maturation zone. Transcripts related to cell aging were found to encode lipase 3, several proteases showing different spatial expression patterns, as well as the serine/threonine protein kinase SAPK10, and Arabidopsis thaliana Salt- and Drought-Induced Ring Finger 1 (SDIR1) were present at high levels in both the elongation and maturation zones. By contrast, transcripts involved in osmotic stress-responsive abscisic acid signaling such as Tetratricopeptide Repeat-Containing Protein (TTL1) were highly expressed only in the elongation zone29,30,31.
Most GO term assignments (1,303) contained transcripts from both cv. Clipper and LR Sahara (Fig. 4D). Transcripts assigned to the GO term metabolic process were particularly highly expressed in the root tip of cv. Clipper and the elongation zone of Sahara, including pyruvate dehydrogenase and 3-ketoacyl-CoA thiolase 2. Transcripts related to protein phosphorylation were present in both cv. Clipper and LR Sahara, and proteins involved in stress resistance such as RPM1-interacting protein 4 (RIN4) and mitogen-activated protein kinase (MAPK1) showed similar expression patterns in the root of both genotypes. In the maturation zone, defense-related transcripts were widespread, including: pathogenesis-related protein 10 (PR10), patatin, and Xylanase inhibiting protein XIP1 shown to be specifically induced in roots by biotic and abiotic stresses32,33,34.
In this study we generated novel root transcriptomic resources obtained from RNA of three different root zones, and created two de novo assemblies of a barley landrace and a malting cultivar with known contrasting responses to salinity. We have detected root-zone specific transcriptional signatures and dominant expression patterns along longitudinal axes of roots under salinity that often coincided with defined developmental zones and were distinct between cv. Clipper and LR Sahara.
Salinity has a negative impact on root growth in many plant species, including barley which is the most salt-tolerant cereal crop35 and thus a good model to study salt-tolerance mechanisms. The early phase of salinity stress is described as osmotic stress which leads to loss of turgor due to cell dehydration, as well as reduced rates of root cell division and expansion36. The barley malting cv. Clipper and the North African LR Sahara have contrasting root growth phenotypes in response to the early phase of salinity stress: In a previous study, the length of the cell division region in the meristematic zone was unaffected by salt treatment, and the cortical cells in the elongation zone continued to expand in the more salt-tolerant Clipper37. In addition, Clipper was shown to accumulate amino acids, sugars, and organic acids in a root-zone-specific manner hypothesized to be contributing to elongation in response to salt stress: In the meristematic zone, high levels of amino acids, but low levels of metabolites of the energy metabolism (TCA cycle and sugars) where detected, whereas both the elongation and maturation zones accumulated amino acids, sugars, and organic acids after salt treatment.
In our study we detected transcriptomics responses that reflect these physiological and metabolic changes. Similar to the previously observed metabolic response, the transcriptional expression gradient along the root in this study also shows more distinct responses for the meristematic zone than for the other two zones (Table 4). The differential gene expression analysis demonstrates that in cv. Clipper, several highly expressed genes at the root tip were associated with sugar-mediated signaling and osmotic stress response (Fig. 4). After salinity stress, Dehydrin 5, a protein involved in frost tolerance in barley38 and also enhancing tolerance to salt and osmotic stress in Arabidopsis thaliana39, showed an up-regulation of +5.3 (equivalent to +40 fold) (Table 5). Its high transcript abundance in the more salinity-tolerant genotype Clipper points towards a protective role against salinity stress. The elongation zone has the highest expression of genes involved in cell wall organization and protein phosphorylation: The AP2-like ethylene-responsive transcription factor BABY BOOM (BBM) 2 and agglutinin isolectin 2 were strongly up-regulated in the elongation zone after salinity stress in cv. Clipper, but both were strongly down-regulated at similar magnitude in the same zone in LR Sahara (Table 5). Members of the AP2 family of transcription factors such as BBM play important roles in cell proliferation and embryogenesis in Arabidopsis thaliana40, showing that pathways were activated in response to salinity to maintain cell expansion, a result in line with the higher root elongation rate in Clipper compared to Sahara15. Plant lectins are carbohydrate-binding glycoproteins that bind with free sugar or with sugar residues, which would be present at high levels in the elongation zone in Clipper after salinity stress37, and are known to play important roles in plant defense41 and are hypothesized to be induced by abiotic stresses42. Genotype-specific transcriptional changes were also present in both the maturation and elongation zones, such as several peroxidase transcripts which were found at increased levels after salinity stress in cv. Clipper, suggesting that Clipper is capable of up-regulating its root antioxidant system in response to salinity stress (Fig. 4).
In stark contrast to cv. Clipper, which is able to maintain root elongation, cell division and overall root growth, Sahara’s root growth is severely limited under short-term salt stress15. Under short-term stress conditions, cell division is inhibited in the root tip, whereas cell expansion increases in the elongation zone, but overall, root growth is more inhibited in Sahara compared to Clipper37. In addition, Sahara showed lower levels of amino acids, sugars, and organic acids in both the root tip and the elongation zone, and higher levels in the maturation zone similar to Clipper in response to salinity stress37. This was reflected in the transcriptomic response as many more transcriptional changes after salinity stress were detected in the meristematic zone of Sahara compared to Clipper, indicating a much stronger impact of salt on Sahara roots compared to Clipper (Table 4). Gene expression analysis showed that highly expressed genes at the root tip are associated with cellular metabolic processes in LR Sahara. As an example, the sugar transporter SWEET showed an upregulation of +8-log2-fold (equivalent to +256-fold) indicating the activation of sugar signaling pathways that interact with stress pathways to modulate carbohydrate metabolism, possibly a response to low sugar levels in the root tip after salinity stress (Table 5). The transcriptomic profile confirms that Sahara is more affected by salt in the initial stages of root growth, resulting in up-regulation of cell senescence-related transcripts, such as proteases and lipases in the elongation zone. The adaptive process to abiotic stress is primarily regulated by ABA, and the greater impact of salinity on the sensitive LR Sahara compared to cv. Clipper was also reflected by the high level of differentially expressed transcripts related to ABA signaling in the maturation zone of Sahara (Fig. 4). Transcripts involved in plant defense such as patatin, a storage glycoprotein with lipase activity found in Solanum tuberosum43, were common between both genotypes and expressed in the elongation and maturation zones after salinity stress.
Transcriptomic analyses provide powerful tools to investigate the molecular basis of plant responses to stress and previous analyses in Arabidopsis thaliana and rice highlighted significant differences in transcript abundance and identity between different cell types of roots using microarray analyses6,44. Our findings emphasize that the different barley root zones are highly specialized in their biological function and although this is based on RNA expression, recent studies on spatially resolved root metabolism45,46 and proteins47 support this notion. Using this comprehensive root transcriptome dataset, we have demonstrated that transcriptional differences in different root zones within a species are much larger than between treatments. When either whole organs or entire plants are analyzed, the results provide only averaged responses that may be highly variable across different regions of the plant. This is of particular concern when investigating salinity stress at the seedling stage because of the developmental gradient that occurs within growing regions of the roots. Further studies will determine if spatially defined alternative splicing events, such as differential intron retention, are induced by salinity stress as part of a root-zone-specific adaptation response to salinity in barley.
Taken together, our study provides detailed insights into the barley root transcriptome that has allowed us to identify and annotate transcripts associated with specialized development and salinity stress response functions in different developmental root zones in two distinct barley varieties. The availability of additional transcriptome reference sets from a barley landrace and a malting cultivar as presented here will serve as valuable resources for plant breeders and to the research community for further functional and comparative genomics studies.
Genotypes of barley (Hordeum vulgare L.) were originally sourced from the Australian Centre for Plant Functional Genomics (University of Adelaide). Two genotypes of barley, the malting variety Clipper (Australia) and the landrace Sahara 3771 (North Africa), were used for RNA-Seq analysis, and were selected based on previously reported physiological diversity in salt tolerance13,15. These genotypes are parents of a mapping populations (Clipper x Sahara 377148), and genetic resources are available for further genetic characterisation.
Sample preparation, RNA isolation, and Illumina sequencing
Uniformly sized seeds were selected, surface-sterilised, and grown under control (nutrient medium without additional NaCl) and salt-treated (nutrient medium supplemented with 100 mM NaCl) conditions as described in ref. 15. After three days of germination, seminal roots were dissected, collected into 1.5 mL tubes, immediately snap-frozen in liquid nitrogen, and then stored at −80 °C. Samples were dissected in the following steps: A 1.5 mm long section marked ‘Zone 1’ (meristematic zone) was taken from the root tip. The second section (‘Zone 2’) was dissected from the elongation zone up to the third section, ‘Zone 3’ (maturation zone), which was excised at the point of visible root hair elongation up to ¾ of the entire root. Four biological replicates were generated for each sample in four separate experiments totaling 48 samples. Total RNA was isolated from 50 mg root tissue using the Qiagen RNeasy kit following the manufacturer’s protocol. The RNA was analyzed for quality and concentration using a DeNovix DS-11 spectrophotometer (Wilmington, DE, USA) and an Agilent Technologies 2100 Bioanalyzer. Libraries were amplified through 13 cycles of PCR using Illumina guidelines. All RNA-seq libraries were constructed and paired-end sequenced (100 bp) on an Illumina HiSeq 2000 system at the Australian Genome Research Facility (Melbourne, Australia). Four lanes were used for each genotype, and all 48 samples were run on a single flow cell. The Illumina TruSeq RNA Sample preparation kit v2 (Illumina Inc.) was used according to the manufacturer’s protocol. Image analysis was performed in real time by the HiSeq Control Software (HCS) v1.4.8 and Real Time Analysis (RTA) v188.8.131.52, running on the instrument computer. The RNA was sequenced to a depth of approximately 31 million read-pairs per sample per lane, giving a total of 1.48 billion reads (749 million read-pairs), yielding 296.06 Gb data. The Illumina CASAVA (Consensus Assessment of Sequence And Variation) 1.8 pipeline was used to generate the sequence data.
Pre-processing of reads
All read pairs were trimmed for quality using Nesoni software (https://github.com/Victorian-Bioinformatics-Consortium/nesoni; q = 20, Illumina TruSeq Adaptor primers removed, min length 30 bp, singletons discarded), and technical replicates of each condition in four lanes were pooled to form a single dataset per experimental condition. FastQC software (http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/) and manual inspection of each sample were used to assess the quality of raw sequence data and results of quality trimming.
De novo transcriptome assembly
Data redundancy and error reduction was performed using digital normalization (Diginorm), which identified approximately 80% of the reads49 as redundant and which were subsequently excluded from the assembly. All biological replicates (per genotype) were pooled into one paired-end read set for independent de novo assemblies per genotype. De novo transcriptome assembly was performed at the Victorian Life Sciences Computation Initiative (VLSCI, University of Melbourne) using 64 GB memory and 16 cores, using the Trinity (v20140717) assembler available at http://trinityrnaseq.github.io/ 16, with k = 25.
To assess the completeness of each de novo assembled transcriptome, the Core Eukaryotic Genes Mapping Approach (CEGMA) program v2.4.01031218 was used. The most likely longest ORF (open reading frame) peptide candidates were extracted from each transcriptome using TransDecoder with default parameters (http://transdecoder.github.io).
Annotation of each transcriptome was performed using the Trinotate pipeline (v2.0.1, http://trinotate.github.io/) including all programs except RNAmmer. BLASTP results to the SwissProt database, and BLASTX results to the UniRef90 database (www.uniprot.org 50) were obtained for all contigs for both transcriptomes. To obtain GO (Gene Ontology) annotations for both transcriptomes, we used the Trinotate-integrated UniProtKB GO annotations.
Quantitation of gene expression
Each sample was individually aligned using Bowtie v2.1.051 to its respective transcriptome with TPM (transcripts per million), FPKM (Fragments Per Kilobase Of Exon Per Million Fragments Mapped), credibility intervals and expected counts produced using RSEM v1.2.1952 via the align_and_estimate.pl perl script, which is part of the Trinity software suite. Both de novo trinity component (“gene”) and transcript contig expression matrices were computed. Cross-sample normalization of FPKM values was carried out using Trimmed mean of M-values (TMM) normalization available in the edgeR software package (version v3.8.653,54.
Identification of outlier samples
Dimensionality reduction to two dimensions using MDS (multidimensional scaling) of the TMM-normalised FPKM matrix was then performed to observe the similarity between samples within each genotype (data not shown). This revealed that one biological replicate sample of Clipper control maturation zone was significantly different to other replicates of the same condition, therefore was excluded from all subsequent analyses.
Identification of differentially expressed genes
Transcript isoform-level expected counts matrices for each genotype were obtained from RSEM and passed to R v3.1.155 and edgeR (version v3.8.653,54) to be used for differential expression analysis. Only matrix rows containing four or more replicates with counts per million (CPM) greater than three were retained for the analysis. We conducted generalized linear model (GLM) analyses for each genotype, and used a number of models for multiple comparisons between treatments, genotypes, and root zones: (i) a basic model treating treatment and zone as separate variables; (ii) an interaction model considering the interaction of zone within treatment, and (iii) an interaction model considering the interaction of treatment within zone. Common-, trended-, and tagwise-dispersion estimates were calculated for each model, and a Benjamini-Hochberg false discovery rate (FDR) of 0.05 or less and an absolute log2-fold change greater than 1 was required to define a gene as differentially expressed. We visualized the TMM-normalised FPKM values of differentially expressed contigs for each cultivar with GO annotation using ggplot256 as provided in Fig. 4 of the Results section and Supplemental Data Sets 3 and 4 to identify contigs from GO terms that warrant further investigation.
Gene Ontology and Pathway enrichment analysis
Singular enrichment analysis (SEA) on the GLM results was performed using AgriGO (http://bioinfo.cau.edu.cn/agriGO/analysis.php) for each genotype separately, using results from the GLM differential expression analysis. For each separate genotype analysis, we employed the following AgriGO SEA parameter settings: hypergeometric test, with Yekutieli multi-testing adjustment, 0.05 significance level, 5 minimum mapping entries and Plant Slim GO and biological process gene ontology. REVIGO57; http://revigo.irb.hr/, default settings except small results requested) was applied to visualize the summary results and generate Fig. 3.
Identification of orthologous genes
We used OrthoFinder23; available from http://www.stevekellylab.com/software/orthofinder, v0.2.8) using default parameters to identify orthologous gene groups from 15 angiosperms; these included five Hordeum genotypes (Clipper, Sahara and independently sequenced Barke, Morex and Bowman) along with key angiosperms (such as Amborella trichopoda, Arabidodopsis thaliana and Oryza sativa) obtained from Phytozome (www.phytozome.net version 10.2, accessed May 2015; 25) to enable comparison of the gene complement at key points in evolution. Augustus v2.5.558 was used to construct ab initio predicted proteomes unless an annotated genome prediction was already available, with a number of plant species gene models including Arabidopsis thaliana, Oryza sativa, and Zea mays used as references. Pairwise species blast was performed using the BLAST v2.2.29+ downloaded from NCBI (National Centre for Biotechnology Information, http://www.ncbi.nlm.nih.gov/).
How to cite this article: Hill, C. B. et al. De novo transcriptome assembly and analysis of differentially expressed genes of two barley genotypes reveal root-zone-specific responses to salt exposure. Sci. Rep. 6, 31558; doi: 10.1038/srep31558 (2016).
The authors would like to thank Dr. Stuart Roy (Australian Centre for Plant Functional Genomics, The University of Adelaide) for providing the barley seeds, and staff at the Australian Genome Research Facility (AGRF, Melbourne) for next-generation sequencing. U.R. and C.B.H. were funded through an Australian Research Council (ARC) Future Fellowship program. This research was funded by grants through the ARC, The Grains Research and Development Corporation (GRDC), The South Australian Government, The University of Adelaide, The University of Queensland and The University of Melbourne. The ARC Centre of Excellence in Plant Cell Walls also acknowledges the support of funds from the Australian Government’s National Collaborative Research Infrastructure Program (NCRIS) administered through Bioplatforms Australia (BPA) Ltd. This research was supported by two Victorian Life Sciences Computation Initiative (VLSCI) grants (numbers VR1534 and VR2498) on its Peak Computing Facility at the University of Melbourne, an initiative of the Victorian Government, Australia, and by resources provided by the Pawsey Supercomputing Centre with funding from the Australian Government and the Government of Western Australia.