Genome sequencing pioneer Craig Venter last month left Wall Street bewitched, his critics bothered, and much of the press bewildered with news that his company, Celera Genomics, has placed sequence data in its database “covering 90% of the human genome.”

Celera's shares were already in favor with investors who had, for the previous few weeks, been buying heavily into genomics companies as a potentially lucrative investment. Within hours of the announcement, the shares shot up in value by almost 40%, from what had already been a record high of $186 to $258, before settling back later in the day to $242.

Those who argue that Venter's ‘shotgun’ approach to sequencing has limitations that can best be met by close collaboration with publicly funded, clone-by-clone strategies claimed that their perspective had been vindicated. This was based on Venter's admission that he had made heavy use of public data in reaching his self-declared “milestone.”

But many were left asking whether it was reasonable that whereas Celera—like the rest of the research community—enjoys free access to the public data, the arrangement is not reciprocal. A statement from the company, although confirming that it eventually anticipated all the data being published openly in the scientific literature, also emphasized that the data were being made available “under a non-redistribution agreement” to Celera database subscribers.

The mass media seemed confused about the significance of the company's statement that it had “compiled DNA sequence covering 90% of the human genome.” A number of newspaper articles, including, for example, one in the Financial Times of London, subsequently reported that the company “had mapped 90% of the human genome.” Some claimed that this gave it a substantial lead over publicly funded sequencing efforts, which are due to produce their own ‘rough draft’ of the genome in the summer. Few reported that this 90% included data obtained by Celera from the public databases, or that the company's own sequenced databases, covering 81% of the estimated 3.2 billion base pairs, needed the public data to be properly ‘finished’.

Despite the reservations, the figures produced by Venter at his press conference were impressive. For example, he stated that the company's statistical analysis and comparison with known genes suggested that “greater than 97% of all human genes are represented in our database” (even if, as others point out, some are present in highly fragmented form).