Article

Biological function of unannotated transcription during the early development of Drosophila melanogaster

Nature Genetics volume 38, pages 1151–1158 (2006)

This article has been updated

Abstract

Many animal and plant genomes are transcribed much more extensively than current annotations predict. However, the biological function of these unannotated transcribed regions is largely unknown. Approximately 7% and 23% of the detected transcribed nucleotides during D. melanogaster embryogenesis map to unannotated intergenic and intronic regions, respectively. Based on computational analysis of coordinated transcription, we conservatively estimate that 29% of all unannotated transcribed sequences function as missed or alternative exons of well-characterized protein-coding genes. We estimate that 15.6% of intergenic transcribed regions function as missed or alternative transcription start sites (TSS) used by 11.4% of the expressed protein-coding genes. Identification of P element mutations within or near newly identified 5′ exons provides a strategy for mapping previously uncharacterized mutations to their respective genes. Collectively, these data indicate that at least 85% of the fly genome is transcribed and processed into mature transcripts representing at least 30% of the fly genome.

Change history

  • 20 September 2006

    In the HTML version of this article initially published online, the largest pieces of two pie charts in Fig. 1a (labeled "2–4 h" and "20–22 h") were in the wrong position. The error has been corrected in the HTML version of the article.

Acknowledgements

This project has been funded in part with Federal Funds from the National Cancer Institute, National Institutes of Health, under Contract No. N01-CO-12400, the National Human Genome Research Institute, National Institutes of Health, under Grant No. U01 HG003147, and Affymetrix, Inc.

Author information

Author notes

    • J Robert Manak & Sujit Dike

    These authors contributed equally to this work.

Affiliations

  1. Affymetrix, Inc., Santa Clara, California, 95051, USA.

    • J Robert Manak, Sujit Dike, Victor Sementchenko, Philipp Kapranov, Jeff Long, Jill Cheng, Ian Bell, Srinka Ghosh, Antonio Piccolboni & Thomas R Gingeras

  2. Department of Molecular and Cell Biology, Center for Integrative Genomics, University of California, Berkeley, California 94720-3200, USA.

    • Frederic Biemar

Authors

  1. J Robert Manak

  2. Sujit Dike

  3. Victor Sementchenko

  4. Philipp Kapranov

  5. Frederic Biemar

  6. Jeff Long

  7. Jill Cheng

  8. Ian Bell

  9. Srinka Ghosh

  10. Antonio Piccolboni

  11. Thomas R Gingeras

Contributions

J.R.M. and S.D. contributed equally to this work. J.R.M. initiated the project and headed the molecular genetics work. S.D. headed the bioinformatics work.

Competing interests

Other than F.B., all authors are employees of Affymetrix.

Corresponding author

Correspondence to J Robert Manak.

Supplementary information

PDF files

  1. Supplementary Fig. 1

    Coverage of RefSeq genes by transfrags.

  2. Supplementary Fig. 2

    Examples of various transcript classes identified by sequencing of RT-PCR clones containing novel 5′ start sites.

  3. Supplementary Fig. 3

    Distribution of intensity ratios, SOM-based centroid profiles, and dispersion index.

  4. Supplementary Table 1

    Computationally predicted 5′ start sites.

  5. Supplementary Table 2

    Comprehensive spreadsheet of manually curated confirmed 5′ start sites.

  6. Supplementary Methods

  7. Supplementary Note

Excel files

  1. Supplementary Table 3

    Expression of RefSeq genes.

About this article

DOI

https://doi.org/10.1038/ng1875
