Research Highlights
Nature Methods 4, 470–471 (2007)


From spectral networks to shotgun sequencing

Allison Doerr

Researchers demonstrate a new paradigm for mass spectrometry–based peptide identification and de novo protein sequencing.

Most researchers in mass spectrometry (MS)-based proteomics take it for granted that at some point they will need to search a database, matching the mass spectra of their peptides against those in the database to identify the peptide sequences and, by extension, their parent proteins. But this process becomes difficult and painfully slow for peptides containing multiple post-translational modifications (PTMs). And it is essentially impossible for proteins from organisms with unsequenced genomes, for which neither sequence nor spectral databases exist.

So what is a curious researcher to do? If you are Pavel Pevzner, a computer scientist at the University of California, San Diego, you think of something a bit out of the ordinary. Pevzner and his graduate student Nuno Bandeira recently reported a strategy to perform database searching without ever comparing a spectrum to a database (Bandeira et al., 2007a).

Rather than search a database to interpret a peptide mass spectrum, Bandeira, Pevzner and their coworkers developed the concept of spectral networks, which use spectral alignment to discover related spectra. For example, two versions of the same peptide, one carrying post-translational modifications and one without, will have related spectra, as will peptides derived from the same protein that have overlapping sequences. Pevzner explains the concept with an analogy: "Suppose you started from hundreds of spectra that are not related; they're kind of like cities. You are connecting them by roads. And all of a sudden, the spectra make sense because when they are connected by roads, you can use neighbors to interpret what is in every city."
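
The idea of linking related spectra can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes each spectrum is reduced to a parent mass and a short list of peak masses, and it scores a pair by counting peaks that coincide either directly or after shifting by the parent-mass difference (as an unmodified and a modified version of one peptide would). The function names, the tolerance, the threshold and the toy masses are all hypothetical.

```python
from itertools import combinations


def shared_peaks(peaks_a, peaks_b, shift=0.0, tol=0.5):
    """Count peaks of A that match a peak of B directly or after adding `shift`."""
    count = 0
    for mass in peaks_a:
        if any(abs(mass - other) <= tol or abs(mass + shift - other) <= tol
               for other in peaks_b):
            count += 1
    return count


def build_network(spectra, min_shared=3, tol=0.5):
    """Return the set of edges (name1, name2) between spectra that look related.

    `spectra` maps a name to (parent_mass, [peak masses]).
    """
    edges = set()
    for (n1, (pm1, p1)), (n2, (pm2, p2)) in combinations(sorted(spectra.items()), 2):
        shift = pm2 - pm1  # mass difference, e.g. caused by a modification
        if shared_peaks(p1, p2, shift, tol) >= min_shared:
            edges.add((n1, n2))
    return edges


# Toy data (made up): B is A with +80 added to the peaks after the second one;
# C is an unrelated spectrum.
toy_spectra = {
    "A": (500.0, [100.0, 200.0, 300.0, 400.0]),
    "B": (580.0, [100.0, 200.0, 380.0, 480.0]),
    "C": (569.0, [113.0, 227.0, 341.0, 455.0]),
}

network = build_network(toy_spectra)
print(network)
```

In this toy case only A and B end up connected, which is the "road" that would let one spectrum help interpret the other.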

To illustrate just how powerful the concept is, Bandeira and Pevzner combined efforts with Karl Clauser of the Broad Institute to demonstrate how spectral networks can be used to reconstruct protein sequences from unpurified mixtures of unknown proteins (Bandeira et al., 2007b). They applied a variety of proteases with different specificities to generate peptides with overlapping sequences. They then used the spectral network of the overlapping peptide fragments to construct a 'virtual' MS/MS spectrum of very high quality, which can be used to determine the sequence of the whole protein.
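
The role of overlapping peptides can be sketched by analogy with shotgun DNA assembly. The example below is only an illustration at the level of letter strings: the published method assembles spectra, not known sequences, and the greedy suffix-prefix merging used here is a stand-in technique rather than the authors' procedure. The peptide strings, function names and minimum overlap are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`, or 0."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(peptides, min_len=3):
    """Repeatedly merge the pair of contigs with the largest overlap."""
    # Drop duplicates and peptides fully contained in another peptide.
    contigs = list(dict.fromkeys(
        p for p in peptides if not any(p != q and p in q for q in peptides)
    ))
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


# Made-up peptides, as if produced by proteases cutting at different sites.
toy_peptides = ["MKTAYIAK", "YIAKQRQISF", "QISFVKSHF", "KSHFSRQ"]

assembled = greedy_assemble(toy_peptides)
print(assembled)
```

Because each toy peptide shares four residues with the next, the pieces chain into a single sequence; without overlaps from differently cut peptides, they would stay as separate fragments.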

Bandeira and Pevzner investigated the venom of the western diamondback rattlesnake, as an example of a potentially medically important proteome from an organism for which the genome sequence has yet to be determined. Not only did they demonstrate for the first time that de novo protein sequencing from a crude biological mixture was possible, but importantly, "Because venom changes depending on the season of the year that it's collected, and geographical reasons [and so forth], we found single nucleotide polymorphism variants in the sample as well," says Bandeira.

Edman degradation, though slow and laborious, remains the gold standard for protein sequencing. "Implicitly, we have nothing against Edman degradation, but we feel that with this technique, Edman degradation becomes unnecessary," says Pevzner. "The number of amino acids we find in a single experiment is in the thousands;...with Edman degradation no one is able to reach anything close."

Bandeira and Pevzner are confident that their concept of spectral networks will become an important new paradigm in MS-based proteomics, and they have already attracted considerable interest from new collaborators. "While we have demonstrated these methods for mixtures of proteins, these are still somewhat small mixtures of proteins," says Bandeira. "It will be exciting to see how these tools scale to whole proteomes."

  1. Bandeira, N. et al. Protein identification by spectral networks analysis. Proc. Natl. Acad. Sci. USA 104, 6140–6145 (2007a).
  2. Bandeira, N. et al. Shotgun protein sequencing: assembly of tandem mass spectra from mixtures of modified proteins. Mol. Cell. Proteomics, published online 19 April 2007 (2007b).
Nature Methods
ISSN: 1548-7091
EISSN: 1548-7105
© 2007 Nature Publishing Group