Main

The proteomics field has generated an overwhelming amount of data. These data are a rich resource, but a major difficulty is how best to represent them and make them available to the wider biological community.

HUPO-PSI has begun to tackle this immense challenge by publishing guidelines for reporting the 'minimum information about a proteomics experiment' (MIAPE; Taylor et al., 2007). This document discusses the importance of reporting where the samples came from and how the analyses were performed. “This is an important first step in overcoming the current fragmentation of proteomics data,” says Henning Hermjakob of the European Bioinformatics Institute and chair of the PSI steering committee.

Though most researchers would agree that the proper annotation of proteomics data and their subsequent deposition into databases are crucial for facilitating data sharing and minimizing error and data loss, this process can be a burden for many of them. PSI is therefore developing a series of MIAPE modules that will provide specific reporting guidelines for different proteomics technologies.

The first of these modules, the 'minimum information required for reporting a molecular interaction experiment' (MIMIx; Orchard et al., 2007), provides a checklist of information that should be reported for molecular interaction data. For example, database accession numbers should be provided for all molecular players, and their biological and experimental roles should be classified. Although reporting this sort of information is common sense, it is all too often omitted.

Ideally, all proteomics data would be made available in public databases in standardized data formats, but the software to make this possible is not yet widely available, and the existing databases are not yet equipped to handle a sudden influx of new data. Hermjakob believes that a network of collaborating databases will be most successful, “so that you have redundancy built in, such that if one resource goes away the others can continue the operation.”

“Hopefully within a reasonable time frame we will have a more or less full and complete record of the publicly available data,” says Hermjakob. To realize this goal, PSI will need the support of the proteomics community as well as the funding agencies holding the purse strings.