In the Odyssey, Homer tells how Menelaus has his hands full when he faces Proteus. The shape-shifting sea god challenges him by transforming himself into a lion, a boar, a serpent, a wave and finally a tree. In proteomics, scientists trying to discern the nature of proteins face an equally formidable challenge, because protein data are as mutable as Proteus. Protein levels in different cell types change constantly as proteins are upregulated, downregulated, cleaved and phosphorylated.

Because protein information, unlike DNA, is not static in the cell, scientists must follow Menelaus' lead and keep a firm grip. They will have to be resourceful, especially as the tools used in today's high-throughput environment still bear the stamp of an earlier era, when one protein at a time was the standard.

The mass spectrometer is key to proteomics. Credit: SPL

The 2D gel used to separate individual proteins from complex mixtures dates back to the mid-1970s. Mass spectrometry, which identifies proteins by their mass once they are isolated, has been around since the First World War. And industrial robots, used to usher the proteins through the intermediate steps that separate these two techniques, date back to the 1960s. Most venerable of all is the century-old separation technique of chromatography.

Fortunately for scientists aiming for widespread protein characterization in the wake of the triumph of genome sequencing, a series of improvements in mass spectrometry and 2D-gel technology is readying these tools for the task that lies ahead. Chromatography was modernized in the 1970s with the invention of high-pressure pumps, the addition of multiple columns and improved packing materials for columns, leading to its modern incarnation as high-performance liquid chromatography, or HPLC, a workhorse in many life-sciences labs.

Established scientific-equipment companies are also working to integrate more steps of the overall proteomics workflow into fewer pieces of equipment. And many start-up companies are looking for ways to enhance or supplant parts of the established proteomics process.

Although there are many different methods emerging — from mapping all the proteins in a single organism to describing the multitude of interactions experienced by proteins during their lifespan — the general technique of isolating and identifying the many proteins in different cell types remains central.

There are several possible starting points for protein identification, but the most well-travelled route into proteomics starts with separating a sample on a 2D electrophoresis gel. The protein spots of interest are then picked and excised, either manually or automatically, and fed into a mass spectrometer (see 'Mass spectrometry: Mix and match').

Celia Caulcott, who heads an effort by the UK's Biotechnology and Biological Sciences Research Council to develop new proteomics technologies, says that, despite a lot of R&D, traditional techniques for protein identification still stand. “The gels still seem to be the pre-eminent way people want to do things,” she says. Beguiling techniques such as protein arrays, which could supplant gels if successful, have yet to prove they can be viable both scientifically and commercially, she says.

Joakim Rodin, director for proteomics R&D at Amersham Biosciences, a biotech-equipment company based in Uppsala, Sweden, agrees that the gel system, although not the easiest thing to work with, has yet to be supplanted. “It's still a lot of work running the gels,” he says. But improvements in capacity, such as the company's Ettan Dalt II system, allow researchers to run up to 12 gels in parallel with greater reproducibility and sensitivity.

Identifying spots on gels can be time consuming. Credit: NONLINEAR DYNAMICS

And the gels themselves have improved, he says. They are getting bigger, so more sample can be loaded, which improves the detection of low-abundance proteins. 'Zoom' gels have also been developed with ever-narrowing pH ranges, which give better resolution as well as higher sensitivity.

Fluorescent labelling is also getting better, he says. Differential-expression analysis using difference gel electrophoresis, developed at Carnegie Mellon University, allows up to three samples to be run simultaneously on a single gel using cyanine-dye chemistry. This should let researchers detect protein differences between normal and cancerous tissues on the same gel. The method also allows gels to be multiplexed, significantly increasing throughput, reproducibility and accuracy, and enabling accurate comparative measurement of differential protein expression across gels. Although the handling and analysis of 2D gels have improved dramatically, Rodin notes that complementary separation techniques, such as chromatography, are needed to resolve the whole proteome.
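To make the arithmetic behind multiplexed gels concrete, here is a minimal Python sketch; the spot volumes are invented, and the use of one cyanine channel as a pooled internal standard on every gel is an assumed design, not a detail from the article:

```python
import math

def standardized(volume, standard_volume):
    """Spot volume relative to the pooled-standard channel on the same gel."""
    return volume / standard_volume

# Invented volumes for one spot: each gel carries a sample plus the standard.
gel1_cy3_normal, gel1_standard = 1200.0, 1100.0
gel2_cy5_tumour, gel2_standard = 2500.0, 1800.0

# Dividing by the shared standard cancels gel-to-gel variation, so samples
# run on different gels can still be compared spot by spot.
fold = (standardized(gel2_cy5_tumour, gel2_standard)
        / standardized(gel1_cy3_normal, gel1_standard))
print(round(math.log2(fold), 2))  # log2 fold change, ~0.35 here
```

The point of the design is that within-gel ratios alone cannot be compared between gels, whereas ratios taken against a common standard can.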

Fortunately, the next stage of the proteomics pipeline, handling the intermediate steps between electrophoresis and mass spectrometry, is becoming easier. Picking the protein spots off the gels and then digesting them into peptide fragments used to be two separate, manual tasks. Now they are becoming automated and integrated into the workflow (see 'Automation: Multiple choice'). But improving and combining individual components can be challenging, says Steve Martin, director of Applied Biosystems' Proteomics Research Center in Framingham, Massachusetts. For example, increasing the capacity of one instrument without accounting for the additional need for throughput in others can actually result in bottlenecks, he says.

Three commercial — and, by today's standards, integrated — systems are made by Amersham Biosciences, Genomic Solutions in Ann Arbor, Michigan, and Bio-Rad in Hercules, California. Their basic components are similar: all use robotic sample preparation, 2D-gel electrophoresis, excision of spots, labelling, and ionization and analysis of the peptide fragments by mass spectrometry. In these systems, data generated from all the instruments are presented in a user-friendly graphical interface.

These stations are quite expensive but, just as core facilities for genome sequencing sprang up once the equipment came of age, the same is likely to happen with protein characterization. This should ensure that smaller academic and commercial labs will share in the advance of knowledge. And smaller labs might still be able to automate individual steps, such as spot picking or digestion, finding new ways to integrate steps that might be overlooked in larger, more streamlined organizations.

Alternatives for eliminating, rather than integrating, such steps are also emerging. One fairly new strategy involves transferring the gel to a membrane made of polyvinylidene difluoride (PVDF), then probing the membrane directly with mass spectrometry. This bypasses the spot-cutting step between electrophoresis and mass spectrometry.

Improvements also extend to mundane but essential items such as stains. Coomassie blue, a staple in most labs, can interfere with the digestion of gel spots by trypsin, so new stains such as zinc imidazole and noncovalent fluorescent SYPRO dyes, which do not have this limitation, are being introduced.

Mass-spectrometry output

It was not until the early 1990s that mass spectrometers, now virtually essential components of the proteomics pipeline, could be used to analyse proteins.

Mass spectrometry relies on the fact that a substance carrying a net electric charge — an ion — can be made to move in a predictable way in an electromagnetic field. Ions are sorted by their mass-to-charge ratio, and from the resulting spectrum a 'mass fingerprint' of the sample can be derived. Software, such as the University of California, San Francisco's ProteinProspector package, can then be used to match the fingerprint to a protein database such as Amos Bairoch's Swiss-Prot (see 'Software: Setting standards').
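As a rough illustration of how such matching works, here is a minimal Python sketch of peptide-mass fingerprinting. The residue-mass table is abbreviated, the trypsin rule is simplified, and the two-entry 'database', peak list and tolerance are invented for the example; real searches run tools like ProteinProspector against full databases such as Swiss-Prot.

```python
# A minimal sketch of peptide-mass fingerprinting. All data below are
# illustrative; peaks are treated as neutral peptide masses for simplicity.

# Monoisotopic residue masses (Da) for a handful of amino acids.
RESIDUE_MASS = {
    'G': 57.02146, 'A': 71.03711, 'S': 87.03203, 'P': 97.05276,
    'V': 99.06841, 'L': 113.08406, 'K': 128.09496, 'R': 156.10111,
}
WATER = 18.01056  # mass of H2O carried by every free peptide

def tryptic_peptides(sequence):
    """Cut after K or R (ignoring the proline exception for brevity)."""
    peptide, pieces = '', []
    for aa in sequence:
        peptide += aa
        if aa in 'KR':
            pieces.append(peptide)
            peptide = ''
    if peptide:
        pieces.append(peptide)
    return pieces

def peptide_mass(peptide):
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def best_match(observed_masses, database, tolerance=0.5):
    """Score each protein by how many observed peaks its digest explains."""
    scores = {}
    for name, seq in database.items():
        predicted = [peptide_mass(p) for p in tryptic_peptides(seq)]
        scores[name] = sum(
            any(abs(obs - pred) <= tolerance for pred in predicted)
            for obs in observed_masses
        )
    return max(scores, key=scores.get), scores

# Hypothetical two-entry 'database' and a peak list derived from protein_A.
db = {'protein_A': 'GASPVKLLRGAK', 'protein_B': 'VVKPLRSSAGKR'}
peaks = [peptide_mass(p) for p in tryptic_peptides('GASPVKLLRGAK')]
print(best_match(peaks, db))  # protein_A explains all three peaks
```

Real search engines also weight matches by peak intensity and correct for database size; this toy scorer simply counts hits.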

In earlier instruments, excessive ionization energies would blast delicate molecules such as DNA and proteins into indecipherable fragments. But soft-ionization innovations such as MALDI (matrix-assisted laser desorption/ionization), in which the sample is embedded in a chemical matrix that absorbs most of the laser's energy, have helped to overcome this limitation.

Nevertheless, the technique still has its limits. A mass fingerprint will not be enough for identification if the protein is not registered in a database, or if post-translational modifications have shifted its observed mass from the predicted value. In these instances, more information can be obtained by selecting ions from the first analysis, fragmenting them further and analysing the resulting fragments in a second stage of the spectrometer, a technique known as tandem mass spectrometry. Of course, more complete databases will also help. And pairing mass spectrometry with other techniques, such as some kinds of protein-detector chip (see 'Chips: Alternative approaches'), may make the method even more useful.
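To see what the second round of analysis adds, consider the b- and y-ion 'ladders' produced when a peptide backbone fragments: the spacing between rungs spells out the sequence, not just the total mass. A minimal sketch, with an invented peptide pair and an abbreviated mass table:

```python
# Two peptides with identical total mass give different fragment ladders,
# which is what lets tandem mass spectrometry tell them apart.
RESIDUE_MASS = {'G': 57.02146, 'A': 71.03711, 'V': 99.06841,
                'L': 113.08406, 'K': 128.09496}
WATER, PROTON = 18.01056, 1.00728

def fragment_ladder(peptide):
    """Singly charged b- and y-ion m/z values for each backbone bond."""
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        prefix = sum(RESIDUE_MASS[aa] for aa in peptide[:i])
        suffix = sum(RESIDUE_MASS[aa] for aa in peptide[i:])
        b_ions.append(round(prefix + PROTON, 3))          # N-terminal piece
        y_ions.append(round(suffix + WATER + PROTON, 3))  # C-terminal piece
    return b_ions, y_ions

# 'GAVLK' and 'AGVLK' weigh the same overall, but their ladders differ.
print(fragment_ladder('GAVLK'))
print(fragment_ladder('AGVLK'))
```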

Future challenges

Automating and integrating the protein-characterization process is a good start, but there is no simple way forward. Automated processes work well when sample sizes are adequate, but in general they falter with very small amounts of material (less than about 10 femtomoles).
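For a sense of scale, a back-of-envelope calculation shows what 10 femtomoles means in molecules and nanograms (the 50-kDa protein size is an assumed example, not a figure from the article):

```python
# Rough scale of the 10-femtomole sensitivity threshold.
AVOGADRO = 6.022e23                 # molecules per mole
moles = 10e-15                      # 10 femtomoles
molecules = moles * AVOGADRO        # ~6 x 10^9 molecules
nanograms = moles * 50_000 * 1e9    # assume 50 kDa (g/mol), convert g to ng
print(f"{molecules:.1e} molecules, {nanograms:.2f} ng")  # 6.0e+09, 0.50 ng
```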

It is hard enough to describe a single protein in a particular state. But things get even more difficult when trying to characterize the thousands of proteins active at any time in various parts of the cell. Michael Washburn and Dirk Wolters at the Syngenta Agricultural Discovery Institute in San Diego and John Yates at the Scripps Research Institute in La Jolla, California, have devised a system that separated and identified 1,484 proteins from the proteome of the yeast Saccharomyces cerevisiae (see Nature Biotechnol. 19, 242–247; 2001). But that relatively low number in the humble yeast doesn't begin to reveal the complexity in humans. For example, a thousand or more proteins are involved in G-protein signalling pathways, which regulate everything from the most basic activities of the cell (division, motility) to the most specialized ones (secretion, electrical excitability).

Perhaps the biggest hurdle is not in designing the equipment but in the conceptual realm. Researchers might know individual elements in a signal cascade, understand something about their function, and perhaps even have obtained their structure. But, explains Ehud Isacoff, a biophysicist at the University of California, Berkeley, scientists are still encumbered by a bias to view the overall picture as if it were made up of discrete events, with one protein handing a signal to another sequentially, in a series of 'stills'.

Leroy Hood (right) and Ruedi Aebersold.

What is really happening in the cell, Isacoff continues, “is that proteins are very localized, and dock against one another very precisely in assemblies, and signalling happens by molecular motions that propagate from one subunit to another”. New methodologies and systems of notation must be devised to describe these things, and a new breed of student has to be recruited who can think about them as concrete objects with specific structures and interactions.

In fact, these needs are being recognized, and the integrative effort is under way on several fronts. Leroy Hood's Institute for Systems Biology in Seattle has been in existence since early last year (see Nature 407, 828–829; 2000), and Al Gilman's Alliance for Cellular Signalling in Dallas set up shop a year ago (see Nature 407, 7; 2000). Both aim at a holistic understanding of the cell in all its pathways and interactions. New methodology — and, perhaps, improved equipment — may emerge from such efforts.

Al Gilman: seeking the cell's secrets.

And a Clinical Proteomics Initiative, under the aegis of the US National Institutes of Health, started seeking grant applications last month. One of its key elements will be an antibody consortium, says Lance Liotta of the National Cancer Institute, one of those engineering the enterprise. This will be modelled on the open-access but industry-supported SNP consortium that is mapping simple genetic variations. Support from industrial and academic groups, both financial and in the form of donated antibodies, has been very enthusiastic, says Liotta. The consortium's ultimate goal is to develop and make available arrays of every antibody and every ligand in existence.

Other aspects of the NIH initiative are looking for new approaches to existing techniques. However, it's unlikely that any new technology will completely replace an old one. Instead, innovations arising from the initiative will probably take their place alongside the stalwarts of electrophoresis, mass spectrometry and chromatography — further complicating the ever-changing face of proteomics.