Sir

The HUGO Gene Nomenclature Committee was pleased to note that Nature has, not before time, decided to be more rigorous in requiring authors to use standard nomenclature for genes and proteins (Opinion, Nature 401, 411; 1999). A growing number of journals now realize the contribution they can make to reducing confusion in the published literature and databases.

Our committee provides a searchable list of human gene symbols, including both the approved symbols and many published synonyms or aliases (www.gene.ucl.ac.uk/nomenclature). It is true that nomenclature authorities are finding it hard to keep pace with the overwhelming amount of data. However, the attitude in the research community should not be ‘someone should do something’, but ‘what can I do?’. Scientists, journal editors, database administrators and funding bodies all need to be nomenclature-aware.

It should not be too difficult for most people to find the nomenclature authorities, and the standards and guidelines, for their areas of interest. Far too often, problems arise because editors and authors are ignorant of established standards. Authors have argued against using a previously agreed symbol when describing the same gene in a new publication “because it would detract from the novelty of their results”. We have also seen journals publish papers about a “new” gene when, had they contacted us, we could have established that the gene had already been described under a different name. Journals are in the best position to enforce nomenclature standards. Researchers are not obliged to submit data to databases, correct existing database entries, or use approved nomenclature unless they have to do so as a requirement of journal publication.

It was suggested that databases need to do more to help standardize nomenclature. Primary sequence databases (GenBank, EMBL, DDBJ) would have a major problem enforcing standard nomenclature. The volume of data submitted to them precludes all except automated initial checks. It would be possible to check that any gene symbol in the annotation was approved, but ensuring that it was the correct symbol for that sequence would be much more complex. An incorrect, but apparently approved, gene symbol attached to a sequence record is worse than useless.
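As a rough illustration of the kind of automated initial check described above, the following minimal Python sketch tests whether a symbol in an annotation is approved or a known alias. The tables and the function name check_symbol are hypothetical and hold only a few example entries; they do not represent the committee's list or any database's actual software.

```python
# Hypothetical, tiny stand-ins for a list of approved symbols and aliases.
# A real check would query the full, current nomenclature list instead.
APPROVED_SYMBOLS = {"TP53", "BRCA1", "CFTR"}
ALIAS_TO_APPROVED = {"p53": "TP53"}


def check_symbol(symbol):
    """Classify a gene symbol found in a record's annotation.

    Returns a (status, approved_symbol) pair, where status is
    "approved", "alias" or "unknown".
    """
    s = symbol.strip()
    if s in APPROVED_SYMBOLS:
        return ("approved", s)
    if s in ALIAS_TO_APPROVED:
        return ("alias", ALIAS_TO_APPROVED[s])
    return ("unknown", None)


# The check looks only at the symbol, never at the sequence itself.
for annotated in ["TP53", "p53", "XYZ123"]:
    print(annotated, check_symbol(annotated))
```

A check of this kind cannot detect the harder case: a record annotated with an approved symbol such as BRCA1 would pass even if the attached sequence belonged to a different gene.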

It is at the level of the secondary, curated databases that much more is being done to ensure that references, sequences and related data are linked to the approved nomenclature as well as to known aliases. Even with extensive collaboration, this labour-intensive process is under considerable pressure. Recent projects, such as LocusLink at the US National Center for Biotechnology Information, are helping to improve links between related databases, and are bringing the issue of approved nomenclature to the attention of a wider community. The downside of this is more pressure on curators to deal with queries.

Funding bodies should consider whether sufficient funds are being allocated to the curation effort, as without it much effort is wasted trying to find relevant data.