Trolling through 270 people's DNA to identify gains and losses of genetic sequence is a daunting prospect. Five groups identified and mapped these 'copy number variants', which probably play key roles in genetic diseases (see page 444). But making sure the data captured real genetic variance, and not artefacts from in vitro samples or automated data analysis, proved the biggest challenge, says Matt Hurles of the Wellcome Trust Sanger Institute in Cambridge, UK. “We had to invent the quality control,” he says. “These samples are all derived from cell lines, which sometimes rearrange their genomes.”

To avoid artefacts, Charles Lee's group at Brigham and Women's Hospital in Boston, Massachusetts, with Stephen Scherer's team at the Hospital for Sick Children in Toronto, Canada, grew the 270 cell lines, examined them at key stages of division and sorted out discrepancies that seemed to result from cell culturing rather than from real copy number variants.

They found obvious problems in 30 cell lines, the most common being signs of three rather than two copies of a chromosome in some cells, a near-impossibility in a living person, says Hurles. “You just don't see these particular extra chromosomes in people.” Don Conrad of the University of Chicago, Illinois, then used family information to sort out less obvious discrepancies.