Variation among Arabidopsis strains can reflect genetic adaptation to their local habitat. Credit: D. Weigel

Joseph Ecker's teenage son listened intently as his father told him about the 1000 Genomes Project, which aims to sequence and compare the genomes of 1,000 people. Ecker, a molecular geneticist, explained that he and his colleagues were launching a similar project for the plant Arabidopsis thaliana. "My son said, 'Well then you should sequence 1,001'," Ecker recalls. "He's a very competitive kid."

And so the Arabidopsis '1001 Genome Project' was born. More than four years later, a loose confederation of laboratories is on the verge of meeting that challenge. Papers published online in Nature [1] and Nature Genetics [2] this week report the sequencing of nearly 100 A. thaliana genomes, the first swathe released by the project; around 400 more have been sequenced, but are not yet ready for publication. Last week, Ecker's group at the Salk Institute in La Jolla, California, won a US$2-million grant from the National Science Foundation (NSF) to polish off another 500 strains, and to catalogue expressed RNAs and map DNA methylation, a chemical modification that affects gene expression.

Arabidopsis thaliana, or thale cress, is a small weed with a simple genome that stands in as a genetic reference for plants with more complex genomes. The genome project aims to uncover genetic changes that enable plants to adapt to their local environments. There are thousands of strains of A. thaliana in stocks worldwide, each of which might carry unique traits that helped it to thrive in its natural environment — tolerance for drought, perhaps, or defences against viral pathogens. "If you learn which genes are important for these traits, you could breed them into crops — to allow them to move into a new environment or continue to succeed where they face climate change," Ecker says.

The mining of natural variation for genetic information has gained momentum as faster DNA sequencing has delivered multiple genomes from wild populations. Similar projects are under way in mice, fruitflies, rice and, of course, humans. "If you go into nature, you find all these fascinating mutations that have survived the sieve of natural selection," says geneticist Trudy Mackay of North Carolina State University in Raleigh, who leads the work in fruitflies. "But in the past we've been hampered in our ability to tease them apart."

The 1001 Genome Project has had some problems, however. Unable to get funding for a single project, participating labs went their own ways, getting grants from a variety of sources, says Detlef Weigel, a plant biologist at the Max Planck Institute for Developmental Biology in Tübingen, Germany, who has spearheaded the project. The result was a fragmented effort, with each group sequencing strains and using techniques that best fitted its own research.

And Ecker frets that this ad hoc coalition won't even have a central place to deposit and organize its data. Arabidopsis researchers have relied on The Arabidopsis Information Resource (TAIR), but NSF funding for that project is being phased out and its fate is unclear. "We don't want to have these data scattered all over the place," says Ecker, "but there may be nowhere to put them."