London

How many genes in the human genome? 142,634 according to Incyte Pharmaceuticals. Credit: J. C. REVY/SCIENCE PHOTO LIBRARY

One of the leading private-sector participants in US genome sequencing efforts says he has firm evidence that there may be more than 140,000 genes in the human genome — a significant increase over conventional estimates closer to 100,000.

Randall Scott, president and chief scientific officer of the biotechnology company Incyte Pharmaceuticals, suggested this new figure during a presentation on Monday (20 September) at the annual sequencing conference organized in Miami, Florida, by The Institute for Genomic Research (TIGR).

His announcement coincides with the news that the chief scientific advisers to the US and British governments, Neal Lane and Robert May respectively, have been discussing a joint declaration underlining the commitment of their governments to public access to raw sequence data.

Scott's new estimate of the number of human genes will come as little surprise to most researchers. Sequencing work on other organisms, such as the fruitfly Drosophila, has already tended to raise estimates of the number of genes those organisms contain, usually by about 20 per cent.

Nevertheless, the higher number will be of considerable interest to geneticists, particularly as it follows other suggestions that the total number of nucleotide bases in the genome is considerably more than the figure of three billion usually quoted in debates on sequencing projects.

Scott's new estimate of the total number of human genes is based on an analysis of the prevalence in genes of CpG islands — short stretches of DNA that can be methylated and as such provide a mechanism for controlling gene expression (for full details see http://www.incyte.com).

Incyte has already produced a large bank of sequence data that is made available for searching by other companies and research institutions on a contract basis. Incyte researchers have sequenced just under half of the CpG islands and compared them to the finished sequences of genes now available.

“This has allowed us to estimate that, overall, 53 per cent of genes have CpG islands associated with them,” says Scott. A further estimation that there are just over 75,000 CpG islands in total in the genome has led Scott and his team to predict a total of 142,634 genes.
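The arithmetic behind such a figure can be sketched roughly as follows. This is an illustration only, not Incyte's published method: it assumes that each island-bearing gene corresponds to one CpG island, so that the gene total is simply the island count divided by the fraction of genes that carry an island. The function name and the rounded inputs are hypothetical; the article gives the island count only as "just over 75,000".

```python
def estimate_gene_count(cpg_islands: float, fraction_with_islands: float) -> float:
    """Estimate total genes from a CpG island count.

    Assumption (not from the article): one CpG island per island-bearing gene,
    so total genes = islands / fraction of genes that have an island.
    """
    if not 0 < fraction_with_islands <= 1:
        raise ValueError("fraction_with_islands must be in (0, 1]")
    return cpg_islands / fraction_with_islands


# Rounded inputs quoted in the article: about 75,000 islands, 53 per cent of genes.
rounded_estimate = estimate_gene_count(75_000, 0.53)  # about 141,509

# Working backwards: the island count that 142,634 genes would imply
# if the fraction were exactly 0.53 (it may itself be a rounded figure).
implied_islands = 142_634 * 0.53  # about 75,596

# A higher assumed fraction of genes with islands gives a lower gene total.
lower_estimate = estimate_gene_count(75_000, 0.75)  # exactly 100,000
```

Under these assumptions the rounded inputs give a figure close to, but not exactly, 142,634. The published total would correspond to roughly 75,596 islands, which is consistent with "just over 75,000". The last line shows why an overestimate of how often genes carry CpG islands would pull the predicted gene count down.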

“Previous recent estimates appear to have been substantially lower than this because they have overestimated the frequency of CpG islands in genes,” he says. “One of the implications is that the genome is even more complex than we originally thought.”

Incyte's calculations are likely to increase interest in the question of patenting of sequencing data, which produced headlines in Britain this week when The Guardian revealed the discussions between Washington and London on a possible joint declaration on access to sequence data. The report was later confirmed by the Department of Trade and Industry.

Officials on both sides of the Atlantic were quick to point out, however, that such a declaration would be aimed primarily at ensuring rapid access to raw sequence data, rather than preventing the patenting of genetic data as such (including the right to patent a specific gene when its sequence data is linked to a specified application).

The move has been welcomed by Britain's Wellcome Trust, which is sponsoring one third of the total human genome sequencing effort. The trust has been insistent — together with its US partners — that one condition of its support is that all such data is made publicly available within 24 hours.

John Sulston, director of the Sanger Centre near Cambridge, which is jointly funded by Wellcome and the Medical Research Council, says: “This data must be shared and controlled by all, and I strongly endorse the idea of giving the human genome an ‘international ownership’ flavour in this way.”