Biological function of unannotated transcription during the early development of Drosophila melanogaster

This article has been updated


Many animal and plant genomes are transcribed much more extensively than current annotations predict. However, the biological function of these unannotated transcribed regions is largely unknown. Approximately 7% and 23% of the detected transcribed nucleotides during D. melanogaster embryogenesis map to unannotated intergenic and intronic regions, respectively. Based on computational analysis of coordinated transcription, we conservatively estimate that 29% of all unannotated transcribed sequences function as missed or alternative exons of well-characterized protein-coding genes. We estimate that 15.6% of intergenic transcribed regions function as missed or alternative transcription start sites (TSS) used by 11.4% of the expressed protein-coding genes. Identification of P element mutations within or near newly identified 5′ exons provides a strategy for mapping previously uncharacterized mutations to their respective genes. Collectively, these data indicate that at least 85% of the fly genome is transcribed and processed into mature transcripts representing at least 30% of the fly genome.

Figure 1: Genomic distribution of detected transcription.
Figure 2: Gene expression across the first 24 h of embryonic development.
Figure 3: Computational assignment of unannotated transcription to RefSeq genes.
Figure 4: P element distribution in the genome.
Figure 5: Molecular genetic evidence for functionality of a newly described distal 5′ start site.
Figure 6: Examples of newly identified distal 5′ start sites.

Accession codes



Gene Expression Omnibus

Change history

  • 20 September 2006

    In the HTML version of this article initially published online, the largest pieces of two pie charts in Fig. 1a (labeled "2–4 h" and "20–22 h") were in the wrong position. The error has been corrected in the HTML version of the article.



This project has been funded in part with Federal Funds from the National Cancer Institute, National Institutes of Health, under Contract No. N01-CO-12400, the National Human Genome Research Institute, National Institutes of Health, under Grant No. U01 HG003147, and Affymetrix, Inc.

Author information


J.R.M. and S.D. contributed equally to this work. J.R.M. initiated the project and headed the molecular genetics work. S.D. headed the bioinformatics work.

Corresponding author

Correspondence to J Robert Manak.

Ethics declarations

Competing interests

Other than F.B., all authors are employees of Affymetrix.

Supplementary information

Supplementary Fig. 1

Coverage of RefSeq genes by transfrags. (PDF 46 kb)

Supplementary Fig. 2

Examples of various transcript classes identified by sequencing of RT-PCR clones containing novel 5′ start sites. (PDF 159 kb)

Supplementary Fig. 3

Distribution of intensity ratios, SOM-based centroid profiles, and dispersion index. (PDF 969 kb)

Supplementary Table 1

Computationally predicted 5′ start sites. (PDF 109 kb)

Supplementary Table 2

Comprehensive spreadsheet of manually curated confirmed 5′ start sites. (PDF 91 kb)

Supplementary Table 3

Expression of RefSeq genes. (XLS 3075 kb)

Supplementary Methods (PDF 70 kb)

Supplementary Note (PDF 18 kb)

About this article

Cite this article

Manak, J., Dike, S., Sementchenko, V. et al. Biological function of unannotated transcription during the early development of Drosophila melanogaster. Nat Genet 38, 1151–1158 (2006).
