A high-quality catalog of the Drosophila melanogaster proteome

Abstract

Understanding how proteins and their complex interaction networks convert genomic information into a dynamic living organism is a fundamental challenge in the biological sciences. As an important step towards understanding the systems biology of a complex eukaryote, we cataloged 63% of the predicted Drosophila melanogaster proteome by detecting 9,124 proteins from 498,000 redundant and 72,281 distinct peptide identifications. This unprecedentedly high proteome coverage for a complex eukaryote was achieved by combining sample diversity, multidimensional biochemical fractionation and analysis-driven experimentation feedback loops, whereby data collection is guided by statistical analysis of prior data. We show that high-quality proteomics data provide crucial information to amend genome annotation and to confirm many predicted gene models. We also present experimentally identified proteotypic peptides matching 50% of D. melanogaster gene models. This library of proteotypic peptides should enable fast, targeted and quantitative proteomic studies to elucidate the systems biology of this model organism.

Figure 1: Description of the directed shotgun proteomics workflow.
Figure 2: Count of distinct protein identifications as a function of overall high-quality peptide identifications and results of several Monte Carlo simulations.
Figure 3: Benefits of the ADE strategy visualized by combined protein parameter histograms and difference histograms.
Figure 4: Additional biases among the experimentally identified proteins.
Figure 5: Identification of new genes/exons using proteomics data.
Figure 6: Outline of a typical PTP experiment.

Acknowledgements

We thank Bernd Roschitzki, Bertran Gerrits, Eva Niederer, Marko Jovanovic, Cristian Köpfli and Michael Walser for technical help, Hans Jespersen and Soeren Schandorff from Proxeon Bioinformatics for discussions regarding the proteotypic peptide data analysis and Hubert K. Rehrauer for help with statistical analysis. The project was funded by the University Research Priority Program Systems Biology/Functional Genomics of the University of Zurich. E.B., S.M., S.S. and S.L. are members of the Center for Model Organism Proteomes (C-MOP) which is funded by the University of Zurich (http://www.mop.unizh.ch). S.L. was supported by a Career Development Award of the University of Zurich. This work was also supported in part by a UBS grant to E.B. and K. Basler, and with federal funds from the US National Heart, Lung, and Blood Institute, National Institutes of Health under contract No. N01-HV-28179.

Author information

Contributions

E.B. and S.M. conducted most of the experimental work; E.B. performed most of the LTQ measurements, coordinated the interdisciplinary project and carried out the proteomics-based genome annotation work. S.M. performed the GO analyses; C.H.A. coordinated and carried out the bioinformatic analyses of the data set (Pfam analysis with C.P.), and conceived the ADE strategy; H.B. established the necessary statistical infrastructure for the project, carried out statistical analyses together with C.H.A. and performed the simulations; S.L. implemented the SBEAMS database, generated the Drosophila Peptide Atlas (supported by E.W.D.) and supported E.B. with the proteomics-based genome annotation work; F.P. and C.P. (supported by E.W.D.) implemented and maintained the computational infrastructure at the Functional Genomics Center Zurich (FGCZ); U.L. provided the protein parameter computations; O.R. performed the gel-filtration experiments; H.L. helped with the setup of the LTQ instrument and LTQ MS analyses; P.G.A.P. generated the software for FCF calculations; J.M. set up the Free Flow Electrophoresis system and helped with experiments; K.K. and S.S. helped with selected experiments; F.K. helped with LCQ and the initial experimental strategy; J.K. shared large data sets and performed the LTQ and LTQ-FT-ICR measurements in the lab of A.J.R.H.; R.S. supported the project with FGCZ resources; E.H. and R.A. initiated the project and provided intellectual and financial support; R.A. carried senior authorship responsibility.

Corresponding authors

Correspondence to Erich Brunner or Christian H Ahrens or Ruedi Aebersold.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Fig. 1

Schematic drawing that depicts the various steps and procedures used to prepare samples for mass spectrometric analysis. (PDF 381 kb)

Supplementary Fig. 2

Visualization of the statistical analysis of protein parameter distributions of all Drosophila proteins (population) and a subset of experimentally identified proteins (sample) using graphs of cumulative distributions and combined histograms. (PDF 524 kb)

Supplementary Table 1

The Drosophila melanogaster proteome based on Berkeley Drosophila Genome Project (BDGP) release 3.2. (DOC 27 kb)

Supplementary Table 2

Overview of all experiments grouped by developmental stage or cell line. (DOC 121 kb)

Supplementary Table 3

PFAM and GO slim analysis. (DOC 324 kb)

Supplementary Table 4

List of unvalidated peptides identified by cross-comparative database searches. (XLS 60 kb)

Supplementary Table 5

List of experimentally observed proteotypic peptides (PTPs). (XLS 2743 kb)

Supplementary Results (DOC 115 kb)

About this article

Cite this article

Brunner, E., Ahrens, C., Mohanty, S. et al. A high-quality catalog of the Drosophila melanogaster proteome. Nat Biotechnol 25, 576–583 (2007). https://doi.org/10.1038/nbt1300
