Companies unveil data from their latest technologies.
Genome researchers gathered in a Florida hotel on 5 February, hoping to see whether companies that build 'third-generation' sequencing technologies can deliver on stunning claims such as sequencing human genomes in three minutes or selling them for $5,000. Although scientists were cautiously optimistic about the data unveiled, they still have major questions about how well this next generation of machines will work.
For one company, the Advances in Genome Biology and Technology meeting in Marco Island, Florida, was a major test. In October 2008, Complete Genomics of Mountain View, California, said it would sell whole human genomes in 2009 for $5,000, but it released no supporting data. At the conference, the company revealed a human genome it said it had sequenced using nine machines for eight days over Christmas.
The company's chief executive, Clifford Reid, says it assembled 254 gigabases (254 billion base pairs) of data into a draft covering 92% of the genome of an anonymous man, and that it read each base an average of 91 times. Like many of the high-speed sequencing technologies currently in use — usually called 'next-generation' technologies, as opposed to the third generation still in development — Complete Genomics produces short reads of DNA. By sequencing each base many times, it aims to diminish the potential errors that could creep in when the short reads are assembled into longer pieces. Reid says that the technology is highly accurate, with less than one-third of a per cent chance of making an error in any given base. That's comparable to the current generation of sequencers.
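The arithmetic behind those figures can be checked with a short sketch. It rests on assumptions that are not from the company: reads are treated as independent, the consensus is taken by simple majority, and the 1% per-read error rate is a hypothetical value chosen only for illustration (the article's "one-third of a per cent" figure may already describe the final consensus). The function names are likewise invented for this example.

```python
from math import comb


def implied_covered_bases(total_bases_read, mean_depth):
    """Distinct genome positions implied by total output and average depth."""
    return total_bases_read / mean_depth


def majority_error(per_read_error, depth):
    """Probability that more than half of `depth` independent reads
    are wrong at a single position (simplified model, an assumption)."""
    k_min = depth // 2 + 1  # smallest number of wrong reads forming a strict majority
    return sum(
        comb(depth, k) * per_read_error**k * (1 - per_read_error) ** (depth - k)
        for k in range(k_min, depth + 1)
    )


if __name__ == "__main__":
    # Figures quoted in the article: 254 gigabases read, 91-fold average depth.
    covered = implied_covered_bases(254e9, 91)
    print(f"Implied covered bases: {covered:.3e}")

    # If that is 92% of the genome, the implied genome size is:
    print(f"Implied genome size:   {covered / 0.92:.3e}")

    # Hypothetical 1% per-read error rate, NOT a figure from the company.
    for depth in (1, 3, 11, 91):
        print(depth, majority_error(0.01, depth))
```

Under these assumptions, 254 gigabases at 91-fold depth corresponds to roughly 2.8 billion covered positions, which squares with 92% of a genome of about 3 billion bases. The majority-vote model also shows why deep coverage matters: the chance of a wrong consensus falls steeply as depth grows.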
Complete Genomics is not selling sequencing machines, but instead performs all its work in-house on its own machines. This has made some scientists sceptical, but others have been encouraged by the company's data. "Their key thing is to show that they can have highly accurate base identification across the vast majority of the genome," says Chad Nusbaum, co-director of the sequencing centre at the Broad Institute in Cambridge, Massachusetts. "They're doing the right analyses, and it seems like things look pretty good."
Speed and cost have been Complete Genomics' main selling points; it did not reveal how much this particular genome cost, but says by June its materials cost will be down to $1,000 per genome. The company aims to launch commercially that month, sequence 1,000 genomes this year and 20,000 human genomes next year.
A few centres have now signed on for pilot projects in which Complete Genomics will sequence five genomes at $20,000 apiece. Many scientists are reserving judgement until they see the results. "The model of outsourcing for generating complex data sets makes scientists nervous," says Rick Myers, director of the HudsonAlpha Institute for Biotechnology in Huntsville, Alabama. "There's a lot of complexity in everybody's genome, and it will be important to have that right."
The quest for accuracy and speed
An hour before Complete Genomics' presentation, the chief technology officer of its crosstown rival, Pacific Biosciences of Menlo Park, spoke. Stephen Turner unveiled a completed genome of the bacterium Escherichia coli, saying the company had covered each base an average of 38 times for an accuracy of greater than 99.9999%.
Pacific Biosciences uses a single-molecule technology based on DNA polymerase, the enzyme used by cells to assemble DNA strands, which reads out the product of the sequencing reaction as it progresses. Although its current machines read 3 bases per second, it aims to produce entire human genomes in under three minutes by 2013. It has also promised to deliver longer read lengths; Turner says the average read length for the E. coli genome was 586 base pairs, with some reads as long as 2,805 base pairs, "higher than any other read length in production," he says. Some scientists hope long read lengths will eliminate errors and allow them to see parts of the genome that are difficult to read.
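To get a feel for the gap between 3 bases per second and a three-minute genome, the required throughput can be estimated with a rough sketch. The genome size of 3 billion bases, the single-pass (1x) coverage and the absence of any overhead are simplifying assumptions, not figures from the company, and the function names are hypothetical.

```python
def required_rate(genome_bases, seconds, coverage=1.0):
    """Aggregate bases per second needed to read a genome in the given time."""
    return genome_bases * coverage / seconds


def speedup_needed(target_rate, current_rate):
    """Factor by which the current read rate would have to grow."""
    return target_rate / current_rate


if __name__ == "__main__":
    GENOME_BASES = 3.0e9  # assumption: approximate human genome size
    CURRENT_RATE = 3.0    # bases per second, as quoted in the article

    target = required_rate(GENOME_BASES, seconds=3 * 60)
    print(f"Required rate:  {target:.3e} bases/s")
    print(f"Speed-up factor: {speedup_needed(target, CURRENT_RATE):.3e}")
```

Under these assumptions the target works out to tens of millions of bases per second, several million times the quoted rate, which would presumably have to come from reading many molecules in parallel rather than from a faster enzyme alone. Any coverage above 1x raises the requirement proportionally.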
Pacific Biosciences intends to launch commercially late next year. Meanwhile, current sequencing technologies — such as those sold by Illumina, based in San Diego; Applied Biosystems, from Foster City, California; and Roche, based in Basel, Switzerland — are pouring out data at an astonishing rate, delivering multiple human genomes' worth of data in a single multi-day run. That rate continues to increase as prices drop; Illumina, for instance, said at the conference that its technology will be able to sequence human genomes for as little as $10,000 by the end of this year. The current companies, says Nusbaum, are "not going down without a fight".
Not all entrants in the crowded sequencing market are faring well. Third-generation machines made by Helicos Biosciences of Cambridge, Massachusetts, have been dogged by sequencing errors. And just days before the conference began, Helicos revealed that its first customer had returned one of the few machines it has sold. At the meeting, the company said that it had assembled the genome of a Caenorhabditis elegans nematode worm. But its troubled history and the high cost of its reagents and machine — just reduced to $999,999, compared with around half a million dollars for other next-generation machines — have eroded the confidence of many scientists.
In a pre-meeting workshop, John McPherson of the Ontario Cancer Institute in Toronto, Canada, summarized the general feeling: "They were a pioneer in single molecule sequencing, but I think they've fallen short of their goals." Yet Helicos' chief technology officer, William Efcavitch, dismisses the critics. "The rumours of our demise are greatly exaggerated," he said at the conference.
Many scientists hope he is right, and that Helicos and many other companies keep competing to deliver more data at lower prices. "The competition has been really healthy," Myers said. "It seems like a miracle that we can get 80 million sequence reads in a few days now, but no matter how well [the companies] do, we want more."
About this article
Check Hayden, E. Genome sequencing: the third generation. Nature (2009). https://doi.org/10.1038/news.2009.86