News and Views
Nature Biotechnology  22, 1242 - 1243 (2004)
doi:10.1038/nbt1004-1242

Climbing the protein ladder

Norman J Dovichi

Norman J. Dovichi is at the Department of Chemistry, University of Washington, Seattle, WA 98195, USA. dovichi@chem.washington.edu

A simple hydrolysis-based approach to peptide analysis may facilitate high-throughput identification of post-translational modifications.
Gene sequences are transcribed from DNA to RNA, which is then translated into the corresponding proteins. The proteins themselves can then undergo a set of enzymatic reactions to generate post-translational modifications of specific amino acids. These modifications, such as phosphorylation and glycosylation, have a profound influence on the function of proteins. Although the determination of the primary amino acid sequence of a protein is routine, scanning an entire protein for post-translational modifications has been quite difficult until now. In this issue, Li and colleagues (ref. 1) report a simple but powerful approach for detecting almost all possible post-translational modifications in a protein and apply the method to characterize a set of proteins from Escherichia coli.

Li's method has its genesis in the earliest protein analysis technologies. It was discovered in the late 1800s that acid treatment hydrolyzes a protein into its constituent amino acids. These amino acids could then be characterized through complex and sophisticated chemical methods, as codified by van Slyke (ref. 2). Such analyses identified the composition of the amino acids in the protein, but not their sequence. Hundreds of grams of protein were required for these procedures, restricting their use to those few proteins that were available in large quantities.

Biology changed 50 years ago with the development of instrumental methods of analysis. In particular, partition chromatography provided a much faster and more sensitive method for the identification of amino acids. Martin and Synge, the developers of partition chromatography, applied it to analyze the acid hydrolysis products of gramicidin S, thereby determining the first sequence of a peptide; their work was recognized with the Nobel Prize in chemistry in 1952 (ref. 3). Frederick Sanger extended their method to determine the primary amino acid sequence of insulin, for which he received his first Nobel Prize in chemistry in 1958 (ref. 4). The acid hydrolysis method was quite tedious and was soon supplanted by Pehr Edman's phenylisothiocyanate method, which sequentially cleaves the N-terminal amino acid from a protein, followed by chromatographic identification (ref. 5). Edman's method was used by two generations of scientists as the workhorse tool for protein sequencing.

Edman chemistry has fallen by the wayside over the past decade because of two developments. First, large-scale genomic sequencing efforts have generated databases with the sequence of all genes for an organism; the genetic code is then used to create a database of proteins for that organism. Second, matrix-assisted laser desorption ionization (MALDI) and electrospray mass spectrometric methods were developed for the analysis of large biomolecules; Tanaka and Fenn shared the 2002 Nobel Prize in chemistry for this work (refs. 6, 7). When combined with tandem mass spectrometry (MS/MS), these ionization methods allow protein identification by comparing partial amino acid sequence obtained from a small number of tryptic digest fragments with the protein database for that organism.

Although extremely powerful, mass analysis of tryptic peptides is not without limitations. Genomic databases are usually silent on alternative splicing, wherein exons can be shuffled to create different proteins from a single gene. Analysis of tryptic peptides usually does not allow identification of the particular splice form of the gene. More importantly, mass analysis of tryptic peptides very rarely covers the full length of the protein; it is not possible to identify post-translational modifications that occur on the lost peptides.

Martin and Synge predicted that "the characterization of the lower peptides resulting from the partial hydrolysis of proteins should be the most valuable method of determining the order of amino acids in proteins and protein structure" (ref. 8). Li and co-workers have revisited this prediction and indeed have developed a powerful method for the determination of protein sequence and post-translational modification. Acid hydrolysis is the breakage of the peptide bond between two amino acids in a protein, and is usually carried out with a high concentration of acid at high temperature. Li reports that carrying out acid hydrolysis under microwave irradiation leads to the controlled hydrolysis of the protein, breaking a single peptide bond and thereby creating two peptides, one containing the N terminus and the other containing the C terminus. This observation is surprising because acid hydrolysis usually results in multiple breaks, yielding a complex mixture of peptides that is not useful for protein mass spectrometry.

Microwave-induced acid hydrolysis seems to occur randomly along the protein's length, creating a set of all possible N- and C-terminal peptides, even for proteins containing acid-labile asparagine-proline bonds. These peptides form a ladder, by analogy to the sequencing ladder created in Sanger's DNA sequencing reaction (ref. 9). Mass analysis of this peptide ladder (MAP analysis) is used to characterize each amino acid in the protein. Protein ladders have been generated in the past, usually by enzymatic digestion. However, the generation of full-length ladders that span the entire sequence of a protein is an exciting development.

As an illustration of the approach, consider the A chain of insulin, which consists of the 21-residue sequence GIVEQCCTSICSLYQLENYCN. Controlled acid hydrolysis produces a set of 20 peptides formed by a single break, creating an N-terminal ladder and a C-terminal ladder. Each ladder generates its own mass spectrum, and software is used to unwrap the mass data from the two ladders. Figure 1 presents an idealized spectrum for the N-terminal ladder generated from insulin. The amplitude of each peak will vary depending on its generation rate and its ionization efficiency.

Figure 1. Analysis of the N-terminal ladder of insulin by mass analysis of peptides (MAP).

A protein undergoes controlled acid digestion to create random breaks leading to the creation of two ladders, one consisting of peptides containing the N terminus of the protein and the other containing the C terminus. The figure shows a highly schematic example of mass analysis of the N-terminal ladder of the A chain of insulin. The sequence of the protein is determined by the mass differences between peaks, and post-translational modifications are identified by anomalous values for the differences. A real mass spectrum would also show peaks corresponding to the C-terminal ladder, and software is used to distinguish the two. The peak heights in the spectrum depend on the rate of bond cleavage and the ionization efficiency of the peptide.
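The ladder idea can be made concrete with a few lines of Python. What follows is only an illustrative sketch, not the authors' software: it assumes standard monoisotopic residue masses, neutral peptides, no disulfide bonds and exactly one break per molecule, and the names `peptide_mass` and `ladders` are invented for the example.

```python
# Monoisotopic residue masses in daltons (standard values, 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # each hydrolysis product carries a free amine and a free carboxyl


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def ladders(seq):
    """Return (N-terminal ladder, C-terminal ladder) for single breaks.

    Breaking the bond after residue i gives seq[:i] and seq[i:].
    """
    cuts = range(1, len(seq))
    n_ladder = [(seq[:i], peptide_mass(seq[:i])) for i in cuts]
    c_ladder = [(seq[i:], peptide_mass(seq[i:])) for i in cuts]
    return n_ladder, c_ladder


INSULIN_A = "GIVEQCCTSICSLYQLENYCN"
n_ladder, c_ladder = ladders(INSULIN_A)

for fragment, mass in n_ladder[:4]:
    print(f"{fragment:<6} {mass:10.5f}")
```

Each of the 20 peptide bonds contributes one rung to each ladder, and the two fragments from any one break always add up to the mass of the intact chain plus one water.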



The difference in mass between adjacent peaks is given by the mass of the corresponding amino acid. For an unmodified protein, the protein sequence can be read directly from those mass differences. As an example, Li and colleagues show a set of ladders that span the length of lysozyme (129 amino acids), allowing determination of its sequence based on a single mass spectrum. This de novo sequencing will be important for proteins obtained from organisms whose genomes have not been sequenced. It will be even more important for sequencing of proteins produced from alternative splice forms of a gene and for detecting the loss of signaling sequences from the N terminus of a protein.
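Reading a sequence off the ladder amounts to looking up each peak-to-peak difference in a table of residue masses. The sketch below shows that lookup under simplifying assumptions: the peaks all belong to one ladder, masses are neutral and monoisotopic, and the tolerance of 0.01 Da is an arbitrary choice for illustration. The function name `read_ladder` is hypothetical. Leucine and isoleucine are reported with the ambiguity code J, and any difference that matches no residue is reported as "?".

```python
# Monoisotopic residue masses in daltons (standard values, 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def read_ladder(masses, tol=0.01):
    """Call one residue per step between adjacent ladder peaks."""
    ordered = sorted(masses)
    calls = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        matches = {aa for aa, m in RESIDUE_MASS.items() if abs(delta - m) <= tol}
        if matches == {"L", "I"}:
            calls.append("J")          # isobaric pair, cannot be told apart
        elif len(matches) == 1:
            calls.append(matches.pop())
        else:
            calls.append("?")          # no match, or ambiguous at this tolerance
    return "".join(calls)


# Idealized N-terminal rungs G, GI, GIV, GIVE, GIVEQ of the insulin A chain.
prefix = "GIVEQ"
peaks = [sum(RESIDUE_MASS[aa] for aa in prefix[:i]) + WATER
         for i in range(1, len(prefix) + 1)]
print(read_ladder(peaks))  # prints "JVEQ"
```

The first residue is not called here because it is defined by the mass of the lightest rung itself rather than by a difference between two rungs.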

Proteins that contain post-translational modifications will generate mass differences that do not correspond to the mass of amino acids. Perhaps most importantly, phosphorylation can be identified by the 80-Dalton mass shift associated with the phosphate group. As an example, Li and colleagues determine six phosphorylation sites of alpha-casein in a single mass spectrum. The authors also demonstrate a technique for the identification of acetylation and heme attachment sites. Of course, acid-labile modifications are lost during this procedure, but such modifications are rare. Li and colleagues present a tour de force application of this technology to scan a set of 28 proteins from an E. coli homogenate, where loss of the N-terminal methionine residue and other signal peptides, the oxidation of methionine, and formation of disulfide bonds were observed.
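An anomalous step in the ladder can be interpreted by asking whether it equals a residue mass plus a known modification mass. The following sketch illustrates that test under stated assumptions: the three mass shifts are standard monoisotopic values, each modification is only tried on the residues listed for it (a restriction chosen for this example), and `classify_step` is a hypothetical name rather than anything from the authors' software.

```python
# Monoisotopic residue masses in daltons (standard values, 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

# Mass shift in daltons and the residues it is tried on (assumption of this sketch).
MODIFICATIONS = {
    "phospho": (79.96633, "STY"),
    "acetyl": (42.01057, "K"),
    "oxidation": (15.99491, "M"),
}


def classify_step(delta, tol=0.01):
    """Return (residue, modification) for one peak-to-peak mass difference."""
    # Prefer the simplest explanation: an unmodified residue.
    for aa, mass in RESIDUE_MASS.items():
        if abs(delta - mass) <= tol:
            return ("L/I" if aa in "LI" else aa), None
    # Otherwise try residue + modification.
    for name, (shift, targets) in MODIFICATIONS.items():
        for aa in targets:
            if abs(delta - (RESIDUE_MASS[aa] + shift)) <= tol:
                return aa, name
    return None, None


print(classify_step(87.03203))             # unmodified serine
print(classify_step(87.03203 + 79.96633))  # serine carrying a phosphate
```

Restricting each modification to plausible residues matters: without it, an oxidized alanine would be indistinguishable from serine, because the two have the same elemental composition.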

The technology has several obvious applications in high-throughput scanning for post-translational modifications across a proteome. One can imagine the use of multidimensional chromatography or capillary electrophoresis for the separation of proteins from a cellular homogenate and fraction collection in high-density microtiter plates. Microwave irradiation can then be used to generate peptide ladders, with subsequent spotting on high-density MALDI plates for MAP analysis.

The method has some limitations. Like all mass spectrometry-based protein analysis methods, ladder-based sequencing cannot distinguish between leucine and isoleucine, which have identical mass, and a high-resolution instrument is required to distinguish between glutamine and lysine, which have similar mass. Li reports that current time-of-flight instruments provide low sensitivity for proteins of higher molecular mass and are not useful in the analysis of ladders prepared from proteins with more than approximately 150 residues (ref. 1). The use of Fourier transform ion cyclotron resonance instruments also promises to extend the mass range of this approach to larger proteins. Finally, the method does not provide details on glycosylation patterns, whose determination remains a formidable hurdle.
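A rough calculation shows why these two residue pairs behave so differently. The sketch below uses standard monoisotopic residue masses and expresses the gap between two candidate residues as a fraction of the ladder peptide's mass in parts per million; the function name `gap_ppm` and the example peptide masses are assumptions made for illustration, and the figure is only a back-of-the-envelope guide to the measurement accuracy involved.

```python
# Monoisotopic residue masses in daltons (standard values).
LEU = 113.08406
ILE = 113.08406
GLN = 128.05858
LYS = 128.09496


def gap_ppm(mass_a, mass_b, peptide_mass):
    """Mass gap between two residues, relative to a peptide mass, in ppm."""
    return abs(mass_a - mass_b) / peptide_mass * 1e6


for peptide_mass in (1_000.0, 5_000.0, 15_000.0):
    print(
        f"{peptide_mass:8.0f} Da  "
        f"Leu/Ile gap: {gap_ppm(LEU, ILE, peptide_mass):6.2f} ppm  "
        f"Gln/Lys gap: {gap_ppm(GLN, LYS, peptide_mass):6.2f} ppm"
    )
```

The leucine/isoleucine gap is zero at every mass, so no improvement in the instrument helps, whereas the glutamine/lysine gap of about 0.036 Da shrinks in relative terms as the ladder peptides grow, which is why larger proteins demand higher-performance instruments.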

REFERENCES
  1. Zhong, H.Y., Zhang, Y., Wen, Z.H. & Li, L. Nat. Biotechnol. 22, XX–XX (2004).
  2. van Slyke, D.D. J. Biol. Chem. 10, 15–55 (1911).
  3. Gordon, A.H., Martin, A.J.P. & Synge, R.L.M. Biochem. J. 35, 1369–1387 (1941).
  4. Sanger, F. Science 129, 1340–1344 (1959).
  5. Edman, P. Eur. J. Biochem. 1, 80–91 (1967).
  6. Tanaka, K. et al. Rapid Commun. Mass Spectrom. 2, 151–153 (1988).
  7. Fenn, J.B., Mann, M., Meng, C.K., Wong, S.F. & Whitehouse, C.M. Science 246, 64–71 (1989).
  8. Consden, R., Gordon, A.H., Martin, A.J.P. & Synge, R.L.M. Biochem. J. 41, 596–602 (1947).
  9. Sanger, F., Nicklen, S. & Coulson, A.R. Proc. Natl. Acad. Sci. USA 74, 5463–5467 (1977).
Nature Biotechnology
ISSN: 1087-0156
EISSN: 1546-1696
© 2004 Nature Publishing Group