The major players in publicly funded efforts to sequence the human genome are proposing to create during the next three years a ‘rough draft’ of the genome that would be about 95 per cent complete.
Such an initiative would add a new element to a strategic plan that is being drawn up to guide the Human Genome Project (HGP) in the next five years. The sequencers say a ‘rough draft’ would allow other scientists to proceed more rapidly with projects that apply new-found sequence data, from discovering rare disease genes to pinning down the molecular details of disease genes that have been mapped but not yet identified.
It would also differ from the ‘whole genome shotgun’ sequencing approach recently announced by J. Craig Venter and the equipment manufacturer Perkin-Elmer (see Nature 393, 101, 296; 1998).
“From the point of view of identifying genes, [the ‘rough draft’] would be absolutely fantastic,” said Leroy Hood of the University of Washington in Seattle during a two-day meeting in Virginia last week to refine details of the five-year plan.
The new strategy became a focal point of the meeting, with participants endorsing the idea that experiments should begin to see how feasible and accurate it would be. Francis Collins, the director of the National Human Genome Research Institute (NHGRI), said the strategy would speed up the process of getting “very useful” sequence “in the hands of the people who want it”.
But Collins, whose institute is the main government funder of the genome project and also sponsored the meeting, emphasized that the rough draft “is not a substitute for finishing the whole thing”.
Federally funded sequencers are using a two-part process to sequence the genome by analysing mapped clones covering the human chromosomes. The first step is ‘shotgun’ sequencing and assembly of random fragments from each clone; the second is the tedious and expensive process of ‘finishing’ by closing gaps and resolving uncertainties.
The new proposal involves dramatically accelerating the shotgun phase, which is the simplest and cheapest component (presently only 10 cents per base). Researchers estimate that this will require a two- to two-and-a-half-fold increase in the government's sequencing capacity, and could produce a high-quality rough draft by 2001. The complete version should still be finished by 2005.
The approach of Venter and Perkin-Elmer breaks the genome into random unmapped fragments for sequencing. But some scientists doubt whether the whole human genome, 70 per cent of which consists of highly repetitive sequences, can be reassembled.
The government's approach requires assembly of much smaller regions, each about one twenty-thousandth of the genome, decreasing the possibilities for error by a factor of 400 million.
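The figure of 400 million matches one simple reading of the numbers: if the opportunities for misassembly grow with the square of the length of the region being assembled (roughly, the number of pairs of positions that could be wrongly joined), then shrinking the region to one twenty-thousandth of the genome cuts those opportunities by 20,000 squared. The quadratic scaling is an assumption made here for illustration, not a model attributed to the sequencers, and the function name is hypothetical.

```python
# Sketch of the arithmetic behind the "factor of 400 million" claim.
# ASSUMPTION: chances of a wrong join scale with the square of the
# region size; the article itself does not state the model used.

REGION_FRACTION_DENOMINATOR = 20_000  # each clone is ~1/20,000 of the genome


def error_reduction_factor(fraction_denominator: int) -> int:
    """Return the reduction in possible wrong joins under quadratic scaling."""
    return fraction_denominator ** 2


if __name__ == "__main__":
    factor = error_reduction_factor(REGION_FRACTION_DENOMINATOR)
    print(f"Reduction factor: {factor:,}")  # Reduction factor: 400,000,000
```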
The participants at the meeting said their project would mesh well with the private venture by providing accurately assembled sequence across the genome, which the private venture's own sequence data could fill out. “It may be that the two together will be adequate to finish the job. If that's the case, everybody wins,” said one sequencer.
But despite excitement at the prospect of producing a useful ‘intermediate product’ in three years, some sequencers thought the project could distract them from the primary goal of producing a complete, highly accurate, finished human sequence by 2005. The sequence-ready maps needed for shotgun sequencing of clones do not yet exist across the entire genome, and their construction could distract sequencing centres that are already stretched to meet the 2005 goal.
Nor is it clear whether Congress will allocate sufficient funds to double sequencing capacity quickly, even though government sequencers have said their labs could handle the extra capacity.
But nearly all participants agreed that the costs of raising sequencing capacity are modest compared with the long-term stakes and that more government sequencing capacity is needed urgently.
“More would be better, sooner rather than later,” said Richard Lifton, a Howard Hughes Medical Institute investigator at the Yale University School of Medicine, who co-chaired the sequencing strategy session.
Collins says that the federal project is “massively undercapitalized” with regard to sequencing capacity. The consequences of not responding to that, he said, are not only that the finished human sequence would not be produced quickly enough, but also that medical advances and the sequencing of other animals would be delayed.
He also says a push for faster sequencing makes sense on economic grounds alone: a “very rough” estimate of the current annual costs of US public and private cloning and sequencing efforts is $500 million.
The draft five-year plan circulated by NHGRI to participants, which will be modified by their input and released in final form in October, calls for the generation of 500 million base pairs of sequence per year by 2003, a tenfold increase on current levels.
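The targets quoted in this article can be combined in a short back-of-envelope calculation. The sketch below derives the current output implied by a tenfold increase, and prices the 2003 target at the 10 cents per base quoted for the shotgun phase. Applying that price to the whole target is an assumption for illustration only: it ignores finishing costs and any change in price by 2003. The function names are hypothetical.

```python
# Back-of-envelope arithmetic using only figures quoted in the article.
# ASSUMPTION: the shotgun-phase price (10 cents per base) is applied
# to the full 2003 target; finishing costs are not included.

TARGET_BASES_PER_YEAR = 500_000_000  # draft plan's goal for 2003
INCREASE_FACTOR = 10                 # "a tenfold increase on current levels"
SHOTGUN_CENTS_PER_BASE = 10          # "presently only 10 cents per base"


def implied_current_rate(target: int, factor: int) -> int:
    """Current annual output implied by the target and the stated increase."""
    return target // factor


def shotgun_cost_dollars(bases: int, cents_per_base: int) -> int:
    """Cost in whole dollars of shotgun-sequencing the given number of bases."""
    return bases * cents_per_base // 100


if __name__ == "__main__":
    current = implied_current_rate(TARGET_BASES_PER_YEAR, INCREASE_FACTOR)
    cost = shotgun_cost_dollars(TARGET_BASES_PER_YEAR, SHOTGUN_CENTS_PER_BASE)
    print(f"Implied current output: {current:,} bases per year")
    print(f"Shotgun cost of 2003 target: ${cost:,} per year")
```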
Participants said their focus on the shotgun strategy, and boosting speed and capacity generally, was not influenced by the Perkin-Elmer/Venter venture. Rather than federal plans being “ignited” by the private effort, said Lifton, “the same opportunities that led to this new enterprise also led to realization within the Genome Project that there was a real opportunity to accelerate” sequencing.
As if to show that the public and private efforts are cooperating, Michael Hunkapiller, president of Perkin-Elmer's Applied Biosystems Division and its top official on the private venture, and Mark Adams, who works with Venter at The Institute for Genomic Research in Rockville, Maryland, were at the conference.
Also present were investigators from the federally funded sequencing laboratories, other university scientists, and officials from NIH and the Department of Energy, the other major government funder of the HGP, along with representatives of Britain's Wellcome Trust, which recently agreed to fund one third of the total sequencing effort of the human genome (see Nature 393, 201; 1998).
Hunkapiller fielded pointed questions about Perkin-Elmer's intellectual property intentions, and one federal sequencer predicted that the product of the new venture would be riddled with errors, but the overall tone of the gathering was collegial.
Harold Varmus, director of the National Institutes of Health (NIH), declared his pleasure at the private-sector participation, and urged federal scientists to make the public and private efforts “consensual, communal, collaborative and productive”. He added: “There's every reason to believe that we can do that.” Hunkapiller said that for the two efforts not to collaborate is “absurd”.
The private group promised prompt and full public access to its raw sequence data, but publicly funded scientists and federal officials remain wary. Collins said that, although the venture's leadership was “reassuring” on the question of public access, he worries that “down the road somebody else who ends up as calling the shots may see this as too much of a give-away and retreat”.