Analysis of the transcriptional complexity of Arabidopsis thaliana by massively parallel signature sequencing


Large-scale sequencing of short mRNA-derived tags can establish the qualitative and quantitative characteristics of a complex transcriptome. We sequenced 12,304,362 tags from five diverse libraries of Arabidopsis thaliana using massively parallel signature sequencing (MPSS). A total of 48,572 distinct signatures, each representing a different transcript, were expressed at significant levels. These signatures were compared to the annotation of the A. thaliana genomic sequence; in the five libraries, this comparison yielded between 17,353 and 18,361 genes with sense expression, and between 5,487 and 8,729 genes with antisense expression. An additional 6,691 MPSS signatures mapped to unannotated regions of the genome. Expression was demonstrated for 1,168 genes for which expression data were previously unknown. Alternative polyadenylation was observed for more than 25% of A. thaliana genes transcribed in these libraries. The MPSS expression data suggest that the A. thaliana transcriptome is complex and contains many as-yet uncharacterized variants of normal coding transcripts.




We thank Larry Tindell, Steve Edberg and Tanya Berardini for excellent technical assistance, and Hajime Sakai and Pam Green for critical reading of the manuscript. This work was supported by the National Science Foundation Plant Genome Research award #0110528 (B.C.M.).

Author information


Corresponding author

Correspondence to Blake C Meyers.

Ethics declarations

Competing interests

C.D.H. is an employee of Lynx Therapeutics, Inc., which is the company that produces the MPSS data. Our paper describes the use of these data for the analysis of transcription in Arabidopsis thaliana.

Supplementary information

Supplementary Table 1

Duplicated genomic signatures and 17-base MPSS data from five libraries.

Supplementary Table 2

Genes and alternative transcripts detected by MPSS signatures for all five libraries.

Supplementary Table 3

Comparison of MPSS and WGA Arabidopsis expression data.

Supplementary Table 4

Most highly expressed transcripts in the five libraries.

Supplementary Table 5

Identifiers and abundances of tissue-specific and constant genes.

Supplementary Table 6

Overlap in signatures comprising each library.

Supplementary Table 7

Signatures with distinct patterns of overlap among the five libraries.


About this article

Cite this article

Meyers, B., Vu, T., Tej, S. et al. Analysis of the transcriptional complexity of Arabidopsis thaliana by massively parallel signature sequencing. Nat Biotechnol 22, 1006–1011 (2004).
