To explore the full complexity and function of the human proteome, it is essential to establish a comprehensive, characterized and standardized collection of specific binding molecules ('binders') directed against all individual human proteins, including variant forms and modifications. Primed with the knowledge of the human genome, such a systematic bank of affinity reagents would be a crucial precompetitive resource for understanding and exploiting the proteome1. Yet although affinity reagents are undeniably of central importance for proteomics, they presently cover only a very small fraction of the proteome, and even though there are many antibodies against some targets (for example, >900 antibodies against p53), there are none against the vast majority of proteins. Moreover, widely accepted standards for binder characterization are virtually nonexistent. Establishing a binder collection will not be an end in itself, but must be accompanied by development of high-throughput assay systems and new-generation protein-detection technologies. The benefits would include cost-effective reagent production and access as well as improved interlaboratory reproducibility, and would have an impact on basic research and medicine as well as the biotechnology and pharmaceutical industries.

ProteomeBinders: vision and goals

ProteomeBinders is a new European consortium with the vision of establishing an infrastructure resource of binding molecules for the entire human proteome, together with tools for their use and applications in studying proteome function and organization. When mature, the resource could be similar in nature to the American Type Culture Collection (ATCC) for cell lines, making the reagents available at cost and with no restrictions for research use. In this Commentary, we present the long-term goals of the ProteomeBinders initiative as well as the current activities of the consortium. The current 4-year, 1.8M€ ($2.37M) initiative, funded by the European Commission 6th Framework Programme in the area of Research Infrastructures, is a 'Coordination Action' involving a network of 26 EU and 2 US partner institutions, leaders in the area of affinity-reagent production, characterization and application (see Supplementary Table 1 online for a list of the lead participants in the ProteomeBinders consortium).

The consortium will coordinate several complementary activities:

  1. Assessing the resources and methods required to develop a complete collection of binding molecules, from representation of the proteome in cDNA collections to binder selection and production.

  2. Reviewing the properties of different molecular types of binders, including natural and recombinant antibodies, scaffold domains, peptides and nucleic acid aptamers, and linking them with proteomics tools and applications in research, diagnostics and therapeutics.

  3. Establishing criteria and methods for universally applicable quality assessment and validation.

  4. Defining standards for data representation and establishing a bioinformatics platform to display information on characterization of individual binders.

  5. Planning the long-term production strategy and organization of the binder infrastructure.

The consortium will hold regular open workshops and disseminate information on the results of its activities through its website (http://www.proteomebinders.org), which will also contain a list of quality-assured binding reagents from all sources. In subsequent phases, the consortium aims to benchmark different types of binders and production methods against defined sets of proteins to select those most appropriate for various applications. Embarking on the task of assembling the resource by systematically collecting and/or creating the tens of thousands of reagents needed will require an application for much more substantial funds, for example, from the next European Commission Framework Programme, FP7, starting in 2007.

Scale of the problem

The size of the human proteome (including splice variants, post-translational modifications and polymorphisms) is generally recognized to be at least an order of magnitude greater than the 24,000 protein-coding genes (http://www.ensembl.org). This raises several central questions, presently being debated within the ProteomeBinders consortium: 'Is comprehensive coverage of the proteome realistic?' 'How should targets be prioritized: by biological or medical relevance, by unknown function, or according to other criteria (chromosome, etc.)?' Another factor in the scale of the task is that several binders against each target will be required, depending on the nature of the samples (denatured or native), the biological status of the protein (post-translationally modified or not) or the detection mode of the assay (for example, sandwich configurations).

Choices and challenges in developing a binder collection

The consortium will consider the following stages in binder production and application to reach a consensus for future action.

Target generation. Full-length proteins, probably the optimal targets for binder selection, can be expressed from cDNA collections2 using a variety of systems (bacterial3, insect, mammalian, cell-free), but there is often limited success in obtaining them in soluble, correctly folded form4,5. Protein fragments or peptides are alternatives, especially combined with large-scale epitope prediction. 'Protein epitope signature tags' (PrESTs)6 are genome-unique, nonrepetitive and nonhydrophobic protein subsequences, shown in the Swedish human proteome atlas project to be suitable for raising and affinity purifying polyclonal antibodies (http://www.proteinatlas.org)7. ProteomeBinders aims to integrate existing, but presently fragmented, bioinformatics tools into distributed and/or virtual facilities that specifically address target-site selection within proteins. In the context of constructing a large-scale binder resource, an important goal will be to reduce the amount of target required (and hence also the cost), for example, by using microarrays for selection or using micro- or nanotechnology for specificity and affinity assays8.

Molecular varieties of binders: antibodies and alternatives. Antibodies are by far the most familiar and best understood affinity reagents, and generally the researcher's first choice, but they are not the only ones. ProteomeBinders unites expertise both on antibodies and on alternative binding reagents with antibody-like specificity and affinity, including nonimmunoglobulin scaffolds9,10 (Affibodies11, Anticalins12, designed Ankyrin repeat proteins13 and others), nucleic acid aptamers14, peptides and chemical entities (Box 1). Antibodies (including monoclonals, monospecific polyclonals, camelid heavy chains15, and recombinant scFv fragments and single domains) are considerably more difficult to produce than some alternative scaffolds, which can be expressed at high yield in bacteria13. The latter are also more robust and provide opportunities for engineering of functional properties. Among the attractions of antibodies are widespread competence in technologies, access to large libraries for recombinant selection16 and availability of secondary reagents and detection systems. Recombinant binding molecules have the advantage of being completely described by their sequence, so that documentation and replication of experiments can be more objective. Although the alternatives to antibodies may not at first be widely accepted, users may well not be too concerned with the structure of the reagent, so long as it has been demonstrated to work in their particular application. Accordingly, the consortium will undertake benchmarking of the properties of alternative binders alongside conventional antibodies to define the 'right binder for the job'. Whatever the molecular species, sustainability will be a key factor: ultimately, a replenishable resource is required.

Binder production methods and scaling. The production of binding molecules for a systematic program can be contrasted in many respects with hypothesis-driven research (Table 1). For 'classical' antibodies, the production routes are either to raise polyclonals (purified for monospecificity) or hybridomas; throughput of monoclonals can be increased by immunization with antigen mixtures and selections on protein arrays17. For all molecular binder varieties, recombinant display library approaches can be applied and coupled with several possible selection methods. Systems with established track records such as phage display18, ribosome display19,20, cell-surface display, bacterial two-hybrid, functional colony screening, protein-fragment complementation21, SELEX22 for aptamer selection and combinations of these methods23 will be compared taking into account the molecular entity being selected and the intended downstream applications (Box 2). For example, if affinity is crucial, technologies with built-in evolution will be required (for example, ribosome display24), whereas if only 'some binder at some epitope' is needed, more technologies become available and selection is less stringent. For intracellular applications, binders should fold functionally in the reducing environment of the cell25. Other technology evaluation criteria include robustness, library creation and size, range of scaffolds that can be expressed, automation, and limits of scale and throughput. Contributions of the consortium will be to deliver effective protocols and actively check their robustness by rotating and annotating them between laboratories, and to identify potentially automatable steps, distinguishing those that are generic from the method-specific ones.

Table 1 Comparison of binder generation for 'classical' hypothesis-driven research and systematic approaches (adapted from ref. 36)

Characterization and quality control. The critical area of quality control is all too often sidelined. Validation will be a central issue; there will be a requirement to demonstrate the quality of selection methods and binder formats, as well as of each individual binder. Although different binder types may have superior characteristics for defined applications, certain criteria (affinity, specificity8 and cross-reactivity, native or denatured target, stability in vivo and in vitro, association and dissociation rate constants26,27) are applicable across all formats. The consortium will establish reference criteria for binder quality control and validation, which could eventually become a 'gold standard' in the research and commercial areas. Besides the classical tests of performance, for example, ELISA and western blotting, other validation methods range from protein and tissue microarrays to genomic correlations with transcript levels, gene knockouts, transgenesis and bioinformatic predictions (Table 2)7. High-throughput systems for initial specificity screens may be combined with advanced kinetic analyses for binder optimization28.
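
As an illustration of how the kinetic criteria above relate to affinity, the following minimal Python sketch computes the equilibrium dissociation constant from the association and dissociation rate constants (K_D = k_off / k_on) and the resulting equilibrium occupancy. It assumes a simple reversible 1:1 binding model with the target in excess, which is an assumption made here and not a model prescribed by the consortium. The function names and the numerical rate constants are hypothetical and chosen only for illustration.

```python
def dissociation_constant(k_on: float, k_off: float) -> float:
    """Equilibrium dissociation constant K_D (M) for 1:1 binding.

    k_on  : association rate constant, in 1/(M*s)
    k_off : dissociation rate constant, in 1/s
    """
    if k_on <= 0 or k_off < 0:
        raise ValueError("k_on must be positive and k_off non-negative")
    return k_off / k_on


def fraction_bound(free_target_conc: float, k_d: float) -> float:
    """Fraction of binder occupied at equilibrium (assumes 1:1 binding, target in excess)."""
    if free_target_conc < 0 or k_d <= 0:
        raise ValueError("concentration must be non-negative and K_D positive")
    return free_target_conc / (free_target_conc + k_d)


# Illustrative values only (not measurements from any real binder).
k_d = dissociation_constant(k_on=1e5, k_off=1e-4)
print(f"K_D = {k_d:.1e} M")
print(f"Fraction bound when [target] equals K_D: {fraction_bound(k_d, k_d):.2f}")
```

Two binders with the same K_D can still differ in k_on and k_off, which is one reason the rate constants are worth recording alongside affinity.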

Table 2 Validation criteria and methods for proteome binding molecules (adapted from ref. 7)

Linking binders to tools and applications. The area of application is perhaps the most important criterion in choosing binder type, selection technology and characterization methods. Binder uses include high-throughput array methods (capture, tissue, lysate arrays) as well as the more classical techniques. For diagnostic and prognostic purposes, the pattern of information29 that can be gained from target-binder interaction is more important than specificity of the binding event, as long as reproducibility and correlation with disease are high. In contrast, for functional analyses, where the global proteome-wide approach is particularly applicable, specificity is essential. Accordingly, the parameters by which binders have to be evaluated can be very different. It is imperative to define applications beforehand and design or refine the selection process accordingly.

Novel methods to measure large sets of proteins using affinity reagents are becoming available, for example, fluorescence cross-correlation spectroscopy30 and proximity ligation approaches, coupled with DNA amplification31,32,33. Further new-generation techniques must be evolved, particularly to improve sensitivity and specificity of detection (ultimately down to the single-molecule or single-cell level), the ability to perform highly multiplexed assays in individual samples, and to determine the spatial distribution of large numbers of molecules in cells and tissues.

Bioinformatics resources. ProteomeBinders will develop community standards for binder data representation in collaboration with the Human Proteome Organization (HUPO) Proteomics Standards Initiative34 (http://psidev.sourceforge.net/). Crucial parameters of binder-target interactions have to be identified and an ontology of binder properties formally defined. These standards will be implemented in a comprehensive database of binders and other web resources, ideally in collaboration with other major binder providers. The database structure needs to anticipate the information to be captured, to allow retrieval of binders matching certain criteria and allow users a meaningful assessment of their performance. Basic searchable information should include the gene identifier of the target, the form of the target used for the binder generation or selection, a description of the molecular nature of the binder, the results from quality assurance and information about suggested applications and availability of the binder.
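
The basic searchable information listed above could be captured in a record along the following lines. This is a minimal Python sketch under stated assumptions: the class name, field names, search function and example entries are all hypothetical placeholders, not a schema defined by ProteomeBinders or the HUPO Proteomics Standards Initiative.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BinderRecord:
    """One entry in a hypothetical binder database; all field names are illustrative."""
    binder_id: str                 # local accession (hypothetical)
    target_gene_id: str            # gene identifier of the target
    target_form: str               # form of target used for generation or selection
    binder_type: str               # molecular nature of the binder
    qa_results: Dict[str, str] = field(default_factory=dict)   # assay name -> outcome
    applications: List[str] = field(default_factory=list)      # suggested applications
    available: bool = False        # whether the reagent can currently be obtained


def find_binders(records: List[BinderRecord], gene_id: str,
                 application: Optional[str] = None,
                 only_available: bool = True) -> List[BinderRecord]:
    """Return records for one target, optionally filtered by application and availability."""
    hits = []
    for rec in records:
        if rec.target_gene_id != gene_id:
            continue
        if application is not None and application not in rec.applications:
            continue
        if only_available and not rec.available:
            continue
        hits.append(rec)
    return hits


# Placeholder entries, not real reagents or real gene identifiers.
catalogue = [
    BinderRecord("B001", "GENE_A", "full-length", "monoclonal antibody",
                 {"ELISA": "pass", "western blot": "pass"},
                 ["western blot", "ELISA"], True),
    BinderRecord("B002", "GENE_A", "peptide", "aptamer",
                 {"ELISA": "pass"}, ["capture array"], False),
    BinderRecord("B003", "GENE_B", "fragment", "scFv",
                 {"ELISA": "fail"}, [], True),
]

for rec in find_binders(catalogue, "GENE_A", application="ELISA"):
    print(rec.binder_id, rec.binder_type, rec.qa_results)
```

Keeping the quality-assurance results as structured fields rather than free text is what would let users retrieve binders matching defined criteria and compare their performance.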

Intellectual property. Patent issues may well influence choices of binders and selection methods. Intellectual property rights to binders must be respected. It should be possible to find a model whereby the inventors of techniques for production of binders benefit from having their reagents selected by such a program. The relationship between an open access resource and commercial activities will doubtless be a topic of future debate.

Strategies for the long-term production phase

Clearly, new binders have to be made in very large numbers. For antibodies, outsourcing to multiple sites can be considered. After adequate assessment, existing binders from the research community and commercial suppliers can also be incorporated. However, quality control and characterization should be retained within the consortium to ensure standardization. When trying to match the available funding with the task of generating more than 100,000 reagents, it becomes clear that new solutions and technical optimizations must be found. High-throughput approaches must be adopted at all levels, from protein expression and binder production to multiplexed assays and multiparameter tests35, and will be an integral part of any design of a binder-generation pipeline36.

Networking in and beyond Europe

Other initiatives in the area of affinity reagents include the US National Cancer Institute proteome reagents program, focused mainly on cancer-related monoclonals37 (http://proteomics.cancer.gov), the HUPO antibody initiative (http://www.hupo.org/research/hai/) and the Swedish human proteome atlas project7 (http://www.proteinatlas.org). Antibody Factory (http://www.antibody-factory.de) is a German national initiative to develop high-throughput recombinant antibody methods35,38. Additionally, small-molecule library initiatives have as their ultimate goal generating ligands for every human protein7 function as anticipated by the US National Institutes of Health Molecular Libraries Screening Network39 (http://nihroadmap.nih.gov/molecularlibraries/), Germany's ChemBioNet (http://www.chembionet.de/) and the new Spanish ChemBioBank (http://www.pcb.ub.es/chembiobank/). At the moment, these initiatives are independent and in some cases complementary; however, given the scale of the problem, it would seem that in the future, the only way to tackle the task will be through coordination of activities.

Meanwhile, individual researchers have a critical role to play in shaping a resource that will eventually be theirs to use and they are expected to contribute to the effort by their comments and experience. The ProteomeBinders consortium welcomes comments and discussion from the wider community, via the website (http://www.proteomebinders.org) and participation in annual meetings (the next open workshop is in Alpbach, Austria, March 13–16 2007; details from oda.stoevesandt@bbsrc.ac.uk).

Note: Supplementary information is available on the Nature Methods website.