Abstract
Many animal and plant genomes are transcribed much more extensively than current annotations predict. However, the biological function of these unannotated transcribed regions is largely unknown. Approximately 7% and 23% of the detected transcribed nucleotides during D. melanogaster embryogenesis map to unannotated intergenic and intronic regions, respectively. Based on computational analysis of coordinated transcription, we conservatively estimate that 29% of all unannotated transcribed sequences function as missed or alternative exons of well-characterized protein-coding genes. We estimate that 15.6% of intergenic transcribed regions function as missed or alternative transcription start sites (TSS) used by 11.4% of the expressed protein-coding genes. Identification of P element mutations within or near newly identified 5′ exons provides a strategy for mapping previously uncharacterized mutations to their respective genes. Collectively, these data indicate that at least 85% of the fly genome is transcribed and processed into mature transcripts representing at least 30% of the fly genome.
Change history
20 September 2006
In the HTML version of this article initially published online, the largest pieces of two pie charts in Fig. 1a (labeled "2–4 h" and "20–22 h") were in the wrong position. The error has been corrected in the HTML version of the article.
Acknowledgements
This project has been funded in part with Federal Funds from the National Cancer Institute, National Institutes of Health, under Contract No. N01-CO-12400, the National Human Genome Research Institute, National Institutes of Health, under Grant No. U01 HG003147, and Affymetrix, Inc.
Author information
Author notes
- J Robert Manak
- Sujit Dike
These authors contributed equally to this work.
Affiliations
Affymetrix, Inc., Santa Clara, California, 95051, USA.
- J Robert Manak
- Sujit Dike
- Victor Sementchenko
- Philipp Kapranov
- Jeff Long
- Jill Cheng
- Ian Bell
- Srinka Ghosh
- Antonio Piccolboni
- Thomas R Gingeras
Department of Molecular and Cell Biology, Center for Integrative Genomics, University of California, Berkeley, California 94720-3200, USA.
- Frederic Biemar
Contributions
J.R.M. and S.D. contributed equally to this work. J.R.M. initiated the project and headed the molecular genetics work. S.D. headed the bioinformatics work.
Competing interests
Other than F.B., all authors are employees of Affymetrix.
Corresponding author
Correspondence to J Robert Manak.
Supplementary information
PDF files
- 1. Supplementary Fig. 1: Coverage of RefSeq genes by transfrags.
- 2. Supplementary Fig. 2: Examples of various transcript classes identified by sequencing of RT-PCR clones containing novel 5′ start sites.
- 3. Supplementary Fig. 3: Distribution of intensity ratios, SOM-based centroid profiles, and dispersion index.
- 4. Supplementary Table 1: Computationally predicted 5′ start sites.
- 5. Supplementary Table 2: Comprehensive spreadsheet of manually curated confirmed 5′ start sites.
- 6. Supplementary Methods
- 7. Supplementary Note
Excel files
- 1. Supplementary Table 3: Expression of RefSeq genes.