The developmental transcriptome of Drosophila melanogaster


Drosophila melanogaster is one of the best-studied genetic model organisms; nonetheless, its genome still contains unannotated coding and non-coding genes, transcripts, exons and RNA editing sites. Full discovery and annotation are prerequisites for understanding how the regulation of transcription, splicing and RNA editing directs the development of this complex organism. Here we used RNA-Seq, tiling microarrays and cDNA sequencing to explore the transcriptome in 30 distinct developmental stages. We identified 111,195 new elements, including thousands of genes, coding and non-coding transcripts, exons, splicing and editing events, and inferred protein isoforms that previously eluded discovery using established experimental, prediction and conservation-based approaches. These data substantially expand the number of known transcribed elements in the Drosophila genome and provide a high-resolution view of transcriptome dynamics throughout development.

Figure 1: Discovery of new RNAs in the Bithorax complex.
Figure 2: Discovery of small non-coding RNAs.
Figure 3: Dynamics of gene expression.
Figure 4: Developmentally regulated splicing events.
Figure 5: Discovery of RNA editing events.

Accession codes

Data deposits

All sequence data have been deposited in the SRA, cDNA sequences have been deposited in GenBank, and array data have been deposited in GEO (see Supplementary Table 35 for all accession numbers). All data are also available at


We thank C. Trapnell and L. Pachter for discussions and assistance with Cufflinks, and E. Clough for comments and feedback. A.N.B. was partially supported by an NSF graduate fellowship. This work was funded by an award from the National Human Genome Research Institute modENCODE Project (U01 HB004271) to S.E.C. (Principal Investigator) and M.R.B., P.C., T.R.G., B.R.G. and N.P. (co-Principal Investigators) under Department of Energy contract no. DE-AC02-05CH11231, and by the National Institute of Diabetes and Digestive and Kidney Diseases Intramural Research Program (B.O.).

J.A., M.R.B., P.C., T.R.G., B.R.G., R.A.H., T.C.K., B.O., N.P. and S.E.C. designed the project. J.A., S.E.B., M.R.B., P.C., T.R.G., B.R.G., R.A.H., B.O. and S.E.C. managed the project. D.M. prepared biological samples. T.C.K. oversaw biological sample production. D.Z. and B.E. prepared RNA samples. J.A. oversaw RNA sample production. W.L. and A.W. analysed array data. P.K. managed array data production. L.Y. prepared Illumina RNA-Seq libraries. C.A.D., L.L., J.E.S., K.H.W. and L.Y. performed Illumina sequencing. J.M.L., B.R.G. and S.E.C. managed Illumina sequencing production. M.B. and R.E.G. performed 454 sequencing of adults. R.A.H. managed production of the embryonic SOLiD and 454 sequencing. C.A.D. managed data transfers. C.Z. managed databases and formatted array and sequence data for submission. C.G.A., P.J.B., S.E.B., A.N.B., S.D., M.O.D., B.R.G. and D.S. developed analysis methods. C.G.A., J.B.B., N.B., B.W.B., S.E.B., A.N.B., J.W.C., S.E.C., L.C., P.C., C.A.D., A.D., M.O.D., B.R.G., R.L., J.H.M., N.R.M., D.S. and Yi.Z. analysed data. B.B.T. aligned the SOLiD data. M.J.V. and J.M.L. generated annotations. C.G.A., D.S. and J.H.M. analysed species validation data. L.J., C.G.A., D.S. and N.R.M. performed species RNA-Seq quality control. Yu.Z. and J.H.M. oversaw sequencing and gathered species samples. C.G.A., A.N.B., J.W.C., L.C., P.C., A.H., D.S., J.M.L., R.L., N.R.M., J.H.M. and B.O. contributed to the text. A.H. assisted with manuscript preparation. B.R.G. and S.E.C. wrote the paper with input from all authors. All authors discussed the results and commented on the manuscript.

