Abstract
Large-scale sequencing of short mRNA-derived tags can establish the qualitative and quantitative characteristics of a complex transcriptome. We sequenced 12,304,362 tags from five diverse libraries of Arabidopsis thaliana using massively parallel signature sequencing (MPSS). A total of 48,572 distinct signatures, each representing a different transcript, were expressed at significant levels. These signatures were compared to the annotation of the A. thaliana genomic sequence; in the five libraries, this comparison yielded between 17,353 and 18,361 genes with sense expression, and between 5,487 and 8,729 genes with antisense expression. An additional 6,691 MPSS signatures mapped to unannotated regions of the genome. Expression was demonstrated for 1,168 genes for which expression data were previously unknown. Alternative polyadenylation was observed for more than 25% of A. thaliana genes transcribed in these libraries. The MPSS expression data suggest that the A. thaliana transcriptome is complex and contains many as-yet uncharacterized variants of normal coding transcripts.
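The comparison of signatures to the genome annotation can be pictured with a small sketch. It is an illustration only, not the authors' pipeline, which this section does not describe. It assumes exact matching of a 17-base signature against both strands of a toy genome and labels a unique hit as sense, antisense, or unannotated relative to a gene model. The names `Gene`, `find_hits`, `classify`, and all sequences and coordinates are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    """Hypothetical gene model; coordinates are 0-based, end-exclusive."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_hits(genome: str, signature: str) -> List[Tuple[int, str]]:
    """Return (position, strand) for every exact match on either strand."""
    hits = []
    for strand, query in (("+", signature), ("-", reverse_complement(signature))):
        pos = genome.find(query)
        while pos != -1:
            hits.append((pos, strand))
            pos = genome.find(query, pos + 1)
    return hits


def classify(genome: str, genes: List[Gene], signature: str) -> str:
    """Label a signature by how its unique genomic hit relates to annotation."""
    hits = find_hits(genome, signature)
    if not hits:
        return "no_match"
    if len(hits) > 1:
        return "duplicated"  # ambiguous: matches more than one location
    pos, strand = hits[0]
    end = pos + len(signature)
    for gene in genes:
        if gene.start < end and pos < gene.end:  # intervals overlap
            return "sense" if strand == gene.strand else "antisense"
    return "unannotated"


# Toy data (invented for illustration; not from the paper).
SIG_A = "GATCAAGGCTTACGTAC"
SIG_B = "GATCTGCAGTTCAAGCT"
SIG_C = "GATCCATGGACTTGCAA"
GENOME = ("TTTTT" + SIG_A + "CCCCC" + reverse_complement(SIG_B)
          + "AAAAA" + SIG_C + "GGGG")
GENES = [Gene("geneA", 0, 27, "+"), Gene("geneB", 27, 49, "+")]

if __name__ == "__main__":
    for sig in (SIG_A, SIG_B, SIG_C):
        print(sig, classify(GENOME, GENES, sig))
```

In this toy setup, a signature whose hit lies on the opposite strand of an overlapping gene model is counted as antisense, and a hit outside every gene model is counted as unannotated. A real analysis would also have to handle signatures that match several genomic locations, which the sketch only flags as "duplicated".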
Acknowledgements
We thank Larry Tindell, Steve Edberg and Tanya Berardini for excellent technical assistance, and Hajime Sakai and Pam Green for critical reading of the manuscript. This work was supported by the National Science Foundation Plant Genome Research award #0110528 (B.C.M.).
Ethics declarations
Competing interests
C.D.H. is an employee of Lynx Therapeutics, Inc., the company that produces the MPSS data. Our paper describes the use of these data for the analysis of transcription in Arabidopsis thaliana.
Supplementary information
Supplementary Table 1
Duplicated genomic signatures and 17-base MPSS data from five libraries.
Supplementary Table 2
Genes and alternative transcripts detected by MPSS signatures for all five libraries.
Supplementary Table 3
Comparison of MPSS and WGA Arabidopsis expression data.
Supplementary Table 4
Most highly expressed transcripts in the five libraries.
Supplementary Table 5
Identifiers and abundances of tissue-specific and constant genes.
Supplementary Table 6
Overlap in signatures comprising each library.
Supplementary Table 7
Signatures with distinct patterns of overlap among the five libraries.
About this article
Cite this article
Meyers, B., Vu, T., Tej, S. et al. Analysis of the transcriptional complexity of Arabidopsis thaliana by massively parallel signature sequencing. Nat Biotechnol 22, 1006–1011 (2004). https://doi.org/10.1038/nbt992
DOI: https://doi.org/10.1038/nbt992
This article is cited by
- Genome-wide identification and characterization of NBS-encoding genes in Raphanus sativus L. and their roles related to Fusarium oxysporum resistance. BMC Plant Biology (2021)
- Large scale study of anti-sense regulation by differential network analysis. BMC Systems Biology (2018)
- Genome-wide characterization of intergenic polyadenylation sites redefines gene spaces in Arabidopsis thaliana. BMC Genomics (2015)
- Genome-wide development of novel miRNA-based microsatellite markers of rice (Oryza sativa) for genotyping applications. Molecular Breeding (2015)
- Small RNA profiling reveals regulation of Arabidopsis miR168 and heterochromatic siRNA415 in response to fungal elicitors. BMC Genomics (2014)