Transcriptome profiling of abiotic responses to heat, cold, salt, and osmotic stress of Capsicum annuum L.

Kang, Won-Hee; Sim, Young Mi; Koo, Namjin; Nam, Jae-Young; Lee, Junesung; Kim, Nayoung; Jang, Hakgi; Kim, Yong-Min; Yeom, Seon-In

doi:10.1038/s41597-020-0352-7

Download PDF

Data Descriptor
Open access
Published: 13 January 2020

Transcriptome profiling of abiotic responses to heat, cold, salt, and osmotic stress of Capsicum annuum L.

Scientific Data volume 7, Article number: 17 (2020) Cite this article

10k Accesses
73 Citations
1 Altmetric
Metrics details

Subjects

Abstract

Peppers (Capsicum annuum L.), belonging to the Solanaceae family, are one of the most economically important crops globally. Like other crops, peppers are threatened by diverse environmental conditions due to different pathogens and abiotic stresses. High-quality reference genomes with massive datasets of transcriptomes from various conditions can provide clues to preferred agronomic traits for breeding. However, few global gene expression profiling datasets have been published to examine the environmental stress-resistant mechanisms in peppers. In this study, we report the RNA-seq analyses of peppers treated with heat, cold, salinity, and osmotic stress at six different time points. RNA-seq libraries from 78 RNA samples containing three biological replicates per time point for each of the abiotic stresses and a mock control were constructed. A total of 204.68 Gb of transcriptome data were verified by differentially expressed genes and gene ontology enrichment analysis. Analyses of the transcriptome data in this study will provide useful information for basic studies of various stimuli to facilitate the development of stress-resistant pepper cultivars.

Measurement(s)	RNA • transcriptome
Technology Type(s)	RNA sequencing
Factor Type(s)	time point measurement
Sample Characteristic - Organism	Capsicum annuum

Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.11440659

Single-cell and spatial RNA sequencing reveal the spatiotemporal trajectories of fruit senescence

Article Open access 10 April 2024

Differential selection of yield and quality traits has shaped genomic signatures of cowpea domestication and improvement

Article 22 April 2024

Spatial co-transcriptomics reveals discrete stages of the arbuscular mycorrhizal symbiosis

Article Open access 08 April 2024

Background & Summary

Abiotic stresses, such as heat, cold, drought, and salinity, which affect the condition of the soil can decrease crop quality, reduce crop production, and threaten food security. Plants respond to abiotic stresses via dynamic and complex reactions that accompany molecular, cellular, and physiological changes in plant tissues¹. To understand the responses of plants to abiotic stresses, diverse crop breeding approaches have been applied from traditional breeding methods to variable -omics methods, such as next-generation sequencing (NGS).

Since the development of NGS, the transcriptome has been widely studied to gain insights into the molecular mechanisms by which plant species adapt to their environment. Currently, transcriptome data analyses of plants are performed in various organisms under diverse conditions, including exposure to abiotic stresses. Most transcriptome studies involving abiotic stresses have been performed in model plants, with a few studies examining crops treated with one or two different stresses at a certain plant development stage^{2,3,4,5,6,7,8}. Therefore, limited comparative transcriptome analyses for plants responding to different abiotic stresses have been performed.

Peppers (Capsicum spp.) are an economically important crop belonging to the Solanaceae family. Over the last two decades, the production and cultivation of chili peppers have steadily increased worldwide, reaching 3.8 million ha of land for cultivation and 40.7 million tons of peppers produced in 2017 (FAO; www.fao.org). Recently, multiple reference pepper genomes and transcriptomes have been published; these datasets can be used to obtain abundant information on pepper breeding traits^9,10,11. However, few studies have examined pepper breeding in diverse conditions, such as during exposure to abiotic and biotic stresses. Comprehensive transcriptome analyses under such diverse conditions are required to obtain a wide variety of gene expression profiles and identify complex gene expression networks.

In this study, we present transcriptome analyses of peppers subjected to four major environment stresses—heat, cold, drought, and salinity—at the same time points and at the same plant stages. We describe in detail the construction of 78 RNA-seq libraries for heat-, cold-, mannitol-, and NaCl-treated and untreated control samples at 0, 3, 6, 12, 24, and 72 h. A total of 204.68 Gb of transcriptome data were generated using transcriptome analysis pipelines consisting of quality control, quantification, and differential gene expression analyses. A principal components analysis (PCA) test, hierarchical clustering of gene expression data and gene ontology (GO) enrichment analysis were used to infer the quality of the RNA-seq data and the characteristics of samples in each treatment. The extensive transcriptome data obtained will provide valuable information for future studies of crops exposed to abiotic stresses.

Methods

Overview of experimental design

The third or fourth leaves were collected from four pepper plants per each biological replicate. Leaves were harvested at 3, 6, 12, 24, and 72 h after treatment. Mock controls were simultaneously collected with each abiotic treatment sample at 0, 3, 6, 12, 24, and 72 h. Marker gene expression for each condition was confirmed for 78 RNA samples by RT-PCR (Fig. 1). Subsequently, RNA-seq libraries were constructed and sequenced. Transcriptome data were used to conduct a quality assessment and aligned to Capsicum annuum cv. CM334 reference genome (v.1.6). The workflows for the abiotic stress treatment and transcriptome data analysis pipeline are presented in Fig. 1.

Plant materials and treatment

Two weeks after germination, the pepper seedlings were transplanted into a 32-plug tray (6 cm in diameter by 6.5 cm in height) and maintained in a growth room at 24 ± 1 °C with a 16-h light and 8-h dark photoperiod. At the six-true-leaf stage, plants were subjected to a temperature of 10 °C or 40 °C to mimic cold or heat stress, respectively. For salinity stress, plants were treated with 50 mL of a 400 mM NaCl solution; for osmotic stress, the peppers were treated with 50 mL of 400 mM mannitol. For transcriptome profiling, the third or fourth leaves from four plants were harvested per replicate at 0, 3, 6, 12, 24, and 72 h after treatment (Fig. 1). Three biological replicates at each time point per condition were collected. The leaf samples were flash-frozen in liquid nitrogen and stored at −80 °C until RNA isolation.

RNA extraction, library construction, and sequencing

Total RNA was extracted from pepper leaf samples (100 mg) using the Trizol reagent (Ambion, USA), according to the manufacturer’s instructions. To perform RNA quality control, RNA was quantified spectrophotometrically using a NanoDrop 2000 spectrophotometer (Thermo Scientific, USA), and RNA integrity was verified by agarose gel electrophoresis. Marker gene expression for each treatment was confirmed by RT-PCR analysis using primers specific for each marker gene: heat stress (CaWRKY)¹², cold stress (CaDhn)¹³, salinity stress (CaPR10)¹⁴, and osmotic stress (CaDhn)¹³ (Fig. 1). RT-PCR was performed using a GeneAtlas thermo-cycler G-02 device (Astec, Japan) using rTaq DNA polymerase (Elpis, Korea) as described by the manufacturer. The gene expression level was normalized to the expression of the CaActin gene and was calculated relative to mock control. Values were calculated following three replications with standard deviations (Fig. 1). Five micrograms of each RNA sample were used to generate a strand-specific library containing inserts of approximately 150–200 bp in size, as previously described¹⁵. In total, 78 cDNA libraries from five treatments (i.e., the four abiotic stresses and a mock control) were constructed for transcriptome analysis (Table 1). For RNA sequencing, 150-nt, paired-end sequencing was conducted using a HiSeq 2500 platform (Illumina, USA) at Macrogen (Korea).

Table 1 Statistical summary of RNA-seq data used in this study.

Full size table

Data preprocessing, gene quantification, and GO enrichment analysis

The raw RNA sequences were filtered and trimmed using cutadapt¹⁶ and the NGS QC Toolkit¹⁷ to remove low-quality bases and adapter sequences. After filtering, the trimmed reads were assessed using FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/), and quality results were then merged with multiQC¹⁸ using default parameters. The preprocessed reads were aligned to C. annuum v.1.6 reference genome (GenBank Accessions: AYRZ00000000)¹⁹ and the annotation gene model v.2.0 (http://peppergenome.snu.ac.kr) using Hisat v2-2.1.0 software²⁰ and default parameters. Transcriptome quantification was performed using featureCounts²¹ to calculate the transcript read counts. Raw read counts were normalized using TMM methods²². PCA was performed using previously published code with modification²³.

We used the edgeR with three package (estimateDisp, glmQLFit, and glmQLFTest) to conduct differential expression analysis²⁴. To design a model formula, all the experimental factors such as time points and treatments were combined into one factor. Then, we found genes that respond differently between the treatment and the mock at any time points. Genes with an adjusted p-value < 0.05 and fold change (FC) > |2| were considered differentially expressed genes (DEGs). The DEGs for each point in treatments were collected to union sets of the DE gene across the time points. The top 30 genes of DEGs for each stress were displayed into heatmap using pheatmap package²⁵. GO enrichment analysis for DEGs were performed based on the functional annotation of Arabidopsis genome (TAIR10, http://www.arabidopsis.org). The best hit proteins were mapped by the best match of BLASTP with filtering category (query coverage ≥60%, subject coverage ≥60% and identity ≥60%). The GO enrichment analysis for best hit proteins was performed using clusterProfiler²⁶ in the R package with org.At.tair.db for Arabidopsis annotation package²⁷ (See the file “Programs and code information.docx” at figshare). The enriched GO terms were obtained by FDR < 0.05 and minGSize = 10. Then, the stress-related GO terms were visualized using ggplot2²⁸. All expression profiling, DEG information and the results of GO enrichment analysis are available at figshare²⁷ (see the files “Normalized TMM.zip”, “DEG result for each stress.zip” and “GO enrichment analysis result.zip” at figshare).

Data Records

The RNA-seq raw data of 78 samples are deposited at the NCBI Sequence Read Archive (SRA) with identifier SRP187794²⁹. The gene expression quantification data of all the samples was deposited at Gene Expression Omnibus (GEO) database with identifier³⁰. The combined additional files and information generating this study have been uploaded to figshare²⁷.

Technical Validation

Quality control

The quality of the RNA-seq data was assessed by investigating the mean quality score per position and per sequence, as well as the GC content and read length distribution using FastQC and multiQC¹⁸. The assessment plots are shown in Fig. 2. The quality scores of the bases per position were higher than the Phred quality score of 25, and all reads were greater than the quality score of 20. The GC content of all samples was shown as a normal distribution; these data indicate a lack of sequence contamination during the sequencing process. These statistics revealed that the raw reads were of high quality. Alignments of the preprocessed reads had a high mapping rate, which on average 70.14% and 90.38% in the gene model (v.2.0) and reference genome of C. annuum (v.1.6), respectively²⁷ (see the file “Statistical summary of RNA-seq for each sample.xlsx” at figshare).

Analysis of transcriptome data

To quantify global gene expression patterns for multiple abiotic stresses, the mapped reads were calculated into read counts for the individual pepper genes. The distributions of all samples for normalized read counts were compared and are shown as a boxplot in Fig. 3a. These distributions were similar between the samples. A PCA analysis revealed that the first two PCs explained most of the variance, and samples from each treatment belonged to the same cluster with similar patterns (Fig. 3b). We further investigated the global gene expression profiles by performing analyses of the DEGs related to each abiotic stress and compared the results to those of the mock control²⁷ (See the file “DEG result for each stress.zip” at figshare). As shown in Fig. 3c, the y-axis depicts the fold change in the log₂-transformed data, and the x-axis represents the log₂-transformed average counts per million reads (CPM). Upregulated DEGs are highlighted in red, whereas downregulated DEGs are shown in blue, with an adjusted p-value < 0.05 and FC > |2|. The gene expression patterns for the 30 top-ranked DEGs are shown in a heatmap (Fig. 4a). We identified a total of 12,494 DEGs shared and unique between stress treatments (Fig. 4b). To validate plant responses to each abiotic stress, GO enrichments were analyzed²⁷ (See the file “GO enrichment analysis result.zip” at figshare) and we represented stress-responsive GO enrichments showing conserved and unique GO terms by comparison of each treatment (Fig. 4c). The distinctive patterns of gene expression and GO enrichments suggest that these data would be useful for comparing changes in gene expression for other abiotic stresses.

Code availability

Codes that were used for the RNA-seq data processing are available at figshare²⁷. Software and their versions were described in Methods.

References

Zhu, J. K. Abiotic stress signaling and responses in plants. Cell 167, 313–324 (2016).
Article CAS Google Scholar
Coolen, S. et al. Transcriptome dynamics of Arabidopsis during sequential biotic and abiotic stresses. Plant J. 86, 249–267 (2016).
Article CAS Google Scholar
Rasmussen, S. et al. Transcriptome responses to combinations of stresses in Arabidopsis. Plant Physiol. 161, 1783–1794 (2013).
Article CAS Google Scholar
Cohen, S. P. & Leach, J. E. Abiotic and biotic stresses induce a core transcriptome response in rice. Sci. Rep. 9, 6273 (2019).
Article ADS Google Scholar
Ma, J. et al. Transcriptomics analyses reveal wheat responses to drought stress during reproductive stages under field conditions. Front. Plant Sci. 8, 592 (2017).
Article Google Scholar
Liu, Z. et al. Temporal transcriptome profiling reveals expression partitioning of homeologous genes contributing to heat and drought acclimation in wheat (Triticum aestivum L.). BMC Plant Biol. 15, 152 (2015).
Article Google Scholar
Gong, L. et al. Transcriptome profiling of the potato (Solanum tuberosum L.) plant under drought stress and water-stimulus conditions. PLoS One 10, e0128041 (2015).
Article Google Scholar
Kang, W. H. & Yeom, S. I. Genome-wide Identification, classification, and expression analysis of the receptor-like protein family in tomato. Plant Pathol. J. 34, 435–444 (2018).
Article CAS Google Scholar
Kim, S. et al. New reference genome sequences of hot pepper reveal the massive evolution of plant disease-resistance genes by retroduplication. Genome Biol. 18, 210 (2017).
Article Google Scholar
Kim, M. S. et al. Global gene expression profiling for fruit organs and pathogen infections in the pepper. Capsicum annuum L. Sci. Data 5, 180103 (2018).
Article CAS Google Scholar
Kang, W. H., Kim, S., Lee, H. A., Choi, D. & Yeom, S. I. Genome-wide analysis of Dof transcription factors reveals functional characteristics during development and response to biotic stresses in pepper. Sci. Rep. 6, 33332 (2016).
Article ADS Google Scholar
Dang, F. F. et al. CaWRKY40, a WRKY protein of pepper, plays an important role in the regulation of tolerance to heat stress and resistance to Ralstonia solanacearum infection. Plant Cell Environ. 36, 757–774 (2013).
Article CAS Google Scholar
Kim, S. Y. et al. Identification of a CaRAV1 possessing an AP2/ERF and B3 DNA-binding domain from pepper leaves infected with Xanthomonas axonopodis pv. glycines 8ra by differential display. Biochim. Biophys. Acta 1729, 141–146 (2005).
Article CAS Google Scholar
Park, C. J. et al. Pathogenesis-related protein 10 isolated from hot pepper functions as a ribonuclease in an antiviral pathway. Plant J. 37, 186–198 (2004).
Article CAS Google Scholar
Zhong, S. et al. High-throughput illumina strand-specific RNA sequencing library preparation. Cold Spring Harb. Protoc. 2011, 940–949 (2011).
Article Google Scholar
Martin, M. Cutadapt removes adapter sequences From high-throughput sequencing reads. EMBnet. Journal 17, 10–12 (2011).
Google Scholar
Patel, R. K. & Jain, M. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 7, e30619 (2012).
Article ADS CAS Google Scholar
Ewels, P., Magnusson, M., Lundin, S. & Kaller, M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32, 3047–3048 (2016).
Article CAS Google Scholar
Kim, S. et al. Capsicum annuum cultivar CM334, whole genome shotgun sequencing project. GenBank, https://identifiers.org/ncbi/insdc:AYRZ00000000 (2017).
Kim, D., Langmead, B. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods 12, 357–360 (2015).
Article CAS Google Scholar
Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
Article CAS Google Scholar
Robinson, M. D. & Oshlack, A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 11, R25 (2010).
Article Google Scholar
Sollner, J. F. et al. An RNA-Seq atlas of gene expression in mouse and rat normal tissues. Sci. Data 4, 170185 (2017).
Article CAS Google Scholar
Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
CAS Google Scholar
Kolde, R. pheatmap: Pretty Heatmaps, https://CRAN.R-project.org/package=pheatmap (2015).
Yu, G., Wang, L. G., Han, Y. & He, Q. Y. ClusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284–287 (2012).
Kang, W. H. et al. Transcriptome profiling of abiotic responses to heat, cold, salt, and osmotic stress of Capsicum annuum L. figshare. https://doi.org/10.6084/m9.figshare.10264832.v5 (2019).
Wickham, H. ggplot2: elegant graphics for data analysis. Springer-Verlag New York (2009).
NCBI Sequence Read Archive, https://identifiers.org/ncbi/insdc.sra:SRP187794 (2019).
NCBI Gene Expression Omnibus, https://identifiers.org/GEO:GSE132824 (2019).

Download references

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Korean Government (NRF-2017R1E1A1A01072843 and NRF-2019R1C1C1007472) and the Korean Research Institute of Bioscience and Biotechnology initiative program. N.K. and J.L. were supported by a scholarship from the BK21 Plus Program from the Ministry of Education. We appreciate the assistance from the KOBIC Research Support Program.

Author information

These authors contributed equally: Won-Hee Kang and Young Mi Sim.

Authors and Affiliations

Institute of Agriculture & Life Science, Gyeongsang National University, Jinju, 52828, Republic of Korea
Won-Hee Kang, Jae-Young Nam & Seon-In Yeom
Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, 34141, Republic of Korea
Young Mi Sim, Namjin Koo & Yong-Min Kim
Department of Agricultural Plant Science, Division of Applied Life Science (BK21 Plus Program), Gyeongsang National University, Jinju, 52828, Republic of Korea
Junesung Lee, Nayoung Kim, Hakgi Jang & Seon-In Yeom

Authors

Won-Hee Kang
View author publications
You can also search for this author in PubMed Google Scholar
Young Mi Sim
View author publications
You can also search for this author in PubMed Google Scholar
Namjin Koo
View author publications
You can also search for this author in PubMed Google Scholar
Jae-Young Nam
View author publications
You can also search for this author in PubMed Google Scholar
Junesung Lee
View author publications
You can also search for this author in PubMed Google Scholar
Nayoung Kim
View author publications
You can also search for this author in PubMed Google Scholar
Hakgi Jang
View author publications
You can also search for this author in PubMed Google Scholar
Yong-Min Kim
View author publications
You can also search for this author in PubMed Google Scholar
Seon-In Yeom
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

W.-H.K. designed the experiments, and organized and wrote the manuscript. Y.M.S. and N.K. performed data collection, generated transcriptome data, and wrote the original manuscript draft. J.-Y.N., J.L., N.K. and H.J. collected samples and generated transcriptome data. Y.-M.K. designed the experiments, and organized and reviewed the manuscript. S.-I.Y. designed the experiments, organized and wrote the manuscript, and supervised the project.

Corresponding author

Correspondence to Seon-In Yeom.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article.

Reprints and permissions

About this article

Cite this article

Kang, WH., Sim, Y.M., Koo, N. et al. Transcriptome profiling of abiotic responses to heat, cold, salt, and osmotic stress of Capsicum annuum L.. Sci Data 7, 17 (2020). https://doi.org/10.1038/s41597-020-0352-7

Download citation

Received: 24 June 2019
Accepted: 17 December 2019
Published: 13 January 2020
DOI: https://doi.org/10.1038/s41597-020-0352-7

This article is cited by

Identification and expression analyses of B3 genes reveal lineage-specific evolution and potential roles of REM genes in pepper
- Young-Soo Park
- Hye Jeong Cho
- Seungill Kim
BMC Plant Biology (2024)
The landscape of abiotic and biotic stress-responsive splice variants with deep RNA-seq datasets in hot pepper
- Nayoung Kim
- Junesung Lee
- Won-Hee Kang
Scientific Data (2024)
Transcriptomics: illuminating the molecular landscape of vegetable crops: a review
- Puneeth Pattaduru Vishwanath
- Vishwanath Bidaramali
- Ramesh Kumar Yadav
Journal of Plant Biochemistry and Biotechnology (2024)
Comparative and expression analyses of AP2/ERF genes reveal copy number expansion and potential functions of ERF genes in Solanaceae
- Jin-Wook Choi
- Hyeon Ho Choi
- Seungill Kim
BMC Plant Biology (2023)
Global co-expression network for key factor selection on environmental stress RNA-seq dataset in Capsicum annuum
- Junesung Lee
- Seon-In Yeom
Scientific Data (2023)