The human genome is littered with small-scale genetic variants, such as SNPs and repeat-length polymorphisms, but little is known about the way in which variants involving larger regions contribute to genetic diversity. Two studies now reveal that large deletions and duplications are more common than was previously thought, prompting a re-evaluation of the way we view human genetic variation.

Large-scale copy-number polymorphisms or variants (CNPs or LCVs), which are deletions or duplications of chromosomal segments, have been identified previously in healthy individuals, but technical limitations have prevented an assessment of whether these variants are common on a genome-wide scale. In a collaborative study, Michael Wigler and colleagues developed a method called ROMA (representational oligonucleotide microarray analysis) that enables deletions or duplications to be identified. The method involves digesting genomic DNA, amplifying the fragments, attaching a fluorescent label and hybridizing them to an array of complementary probes. The signal strength of each probe indicates the copy number of the corresponding genomic region, which can be compared between samples. Using an average of 1 probe every 35 kb, Wigler and colleagues analysed samples from 20 unrelated, healthy individuals from a range of geographical locations. They identified a set of 76 different CNPs, involving regions of 100 kb or more, that varied between individuals, with an average of 11 CNP differences between each pair of subjects. The polymorphisms, most of which had not been identified before, included both deletions and duplications and had a mean length of 465 kb. Most regions of the genome had CNPs, although they were noticeably more frequent in some regions, suggesting that there might be CNP 'hotspots'.
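The comparison of probe signals between samples can be illustrated with a small Python sketch. It is not the authors' analysis pipeline, which is not described here. The function names, the log2-ratio threshold of 0.3, the minimum run of 3 probes and the intensity values are all hypothetical, chosen only to show the idea: a run of neighbouring probes whose signal is consistently lower or higher than in a reference sample suggests a deletion or a duplication. A second helper counts how many calls differ between each pair of subjects, under the simplifying assumption that two calls are the same CNP only if they match exactly.

```python
import math
from itertools import combinations


def call_cnps(positions_kb, test, reference, threshold=0.3, min_probes=3):
    """Call candidate CNPs from per-probe signal intensities.

    A call is a run of at least `min_probes` consecutive probes whose
    log2(test / reference) ratio is beyond +/- `threshold`.
    Both defaults are illustrative assumptions, not published values.
    Returns a list of (start_kb, end_kb, kind) tuples.
    """
    calls = []
    run_start = None  # index of the first probe in the current run
    run_kind = None   # "deletion", "duplication" or None
    n = len(positions_kb)

    # Iterate one step past the end so that the final run is closed.
    for i in range(n + 1):
        kind = None
        if i < n:
            ratio = math.log2(test[i] / reference[i])
            if ratio >= threshold:
                kind = "duplication"
            elif ratio <= -threshold:
                kind = "deletion"

        if kind != run_kind:
            # The previous run ends at probe i - 1; keep it if long enough.
            if run_kind is not None and i - run_start >= min_probes:
                calls.append((positions_kb[run_start], positions_kb[i - 1], run_kind))
            run_start, run_kind = i, kind

    return calls


def pairwise_differences(calls_by_subject):
    """Count calls present in exactly one subject of each pair."""
    return {
        (a, b): len(set(calls_by_subject[a]) ^ set(calls_by_subject[b]))
        for a, b in combinations(sorted(calls_by_subject), 2)
    }


# Hypothetical probes spaced 35 kb apart, matching the average density in the text.
positions_kb = [0, 35, 70, 105, 140, 175, 210, 245]
reference = [1000] * 8
test = [1000, 1020, 480, 510, 495, 505, 990, 1010]  # made-up intensities

calls = call_cnps(positions_kb, test, reference)
print(calls)
```

With these made-up values, four consecutive probes show roughly half the reference signal, so the sketch reports a single deletion spanning the probes at 70 kb to 175 kb. Real array data are noisy, so a practical analysis would need normalization and a statistical segmentation step rather than a fixed threshold.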

Importantly, many of these CNPs are in regions that contain genes, so this type of variation might influence levels of gene expression and lead to phenotypic differences between individuals. For example, one CNP variant contained three copies of the gene PPYR1, which encodes the appetite-regulating neuropeptide Y4 receptor. CNPs are also present in regions that include genes implicated in nervous-system development, leukaemia and drug resistance. So, large-scale CNPs might underlie variation in a diverse range of phenotypes, from body weight to cancer susceptibility.

In a second study, Charles Lee and colleagues used a similar technique to identify LCVs in samples from 39 healthy individuals. They identified 255 polymorphisms in the human genome, with an average of 12 LCVs for each subject.

The authors of both papers point out that their studies are not comprehensive, as the probes that they used represent only a fraction of the genome. Planned studies using larger sets of probes should reveal the full extent to which large-scale polymorphisms contribute to the genetic differences that underlie human individuality.