Nature Genetics
29, 233 - 237 (2001)
doi:10.1038/ng1001-233
Haplotype tagging for the identification of common disease genes
Gillian C.L. Johnson1, Laura Esposito1, Bryan J. Barratt1, Annabel N. Smith1, Joanne Heward2, Gianfranco Di Genova1, Hironori Ueda1, Heather J. Cordell1, Iain A. Eaves1, Frank Dudbridge1, Rebecca C.J. Twells1, Felicity Payne1, Wil Hughes1, Sarah Nutland1, Helen Stevens1, Phillipa Carr1, Eva Tuomilehto-Wolf3, Jaakko Tuomilehto3,4, Stephen C.L. Gough2, David G. Clayton1 & John A. Todd1
1 JDRF/WT Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, University of Cambridge, Wellcome Trust/Medical Research Council Building, Hills Road, Cambridge, UK.
2 Department of Medicine, University of Birmingham and Birmingham Heartlands and Queen Elizabeth Hospitals, Birmingham, UK.
3 Diabetes and Genetic Epidemiology Unit, National Public Health Institute, Helsinki, Finland.
4 Department of Public Health, University of Helsinki, Mannerheimintie, Helsinki, Finland.
Correspondence should be addressed to John A. Todd (john.todd@cimr.cam.ac.uk).
Genome-wide linkage disequilibrium (LD) mapping of common disease genes could be more powerful than linkage analysis if the appropriate density of polymorphic markers were known and if the genotyping effort and cost of producing such an LD map could be reduced. Although different metrics that measure the extent of LD have been evaluated1,2,3, even the most recent studies2,4 have not placed significant emphasis on the most informative and cost-effective method of LD mapping: that based on haplotypes. We have scanned 135 kb of DNA from nine genes, genotyped 122 single-nucleotide polymorphisms (SNPs; approximately 184,000 genotypes) and determined the common haplotypes in a minimum of 384 European individuals for each gene. Here we show how knowledge of the common haplotypes and the SNPs that tag them can be used to (i) explain the often complex patterns of LD between adjacent markers, (ii) reduce genotyping significantly (in this case from 122 to 34 SNPs), (iii) scan the common variation of a gene sensitively and comprehensively and (iv) provide key fine-mapping data within regions of strong LD. Our results also indicate that, at least for the genes studied here, the current version of dbSNP would have been of limited utility for LD mapping because many common haplotypes could not be defined. A directed re-sequencing effort of the approximately 10% of the genome in or near genes in the major ethnic groups would aid the systematic evaluation of the common variant model of common disease.
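The idea behind point (ii), that a small subset of SNPs can stand in for a larger set once the common haplotypes are known, can be illustrated with a toy sketch. This is not the authors' selection procedure, which the abstract does not describe. It is a brute-force search, under the simplifying assumption that a tag set is adequate when it gives every common haplotype a unique allele pattern. The function name `find_tag_snps` and the haplotype data are hypothetical.

```python
from itertools import combinations


def find_tag_snps(haplotypes):
    """Return the smallest list of SNP positions whose alleles give every
    haplotype a unique pattern (toy criterion, exhaustive search)."""
    n_snps = len(haplotypes[0])
    # Try subsets of increasing size so the first hit is a minimal one.
    for size in range(1, n_snps + 1):
        for cols in combinations(range(n_snps), size):
            patterns = {tuple(h[c] for c in cols) for h in haplotypes}
            if len(patterns) == len(haplotypes):
                return list(cols)
    # Reached only if two haplotypes are identical at every SNP.
    return list(range(n_snps))


# Hypothetical data: four common haplotypes over six biallelic SNPs,
# with alleles coded as '0' and '1'.
common_haplotypes = [
    "000000",
    "110100",
    "001011",
    "111110",
]

tags = find_tag_snps(common_haplotypes)
print(tags)  # [0, 2]: genotyping SNPs 0 and 2 separates all four haplotypes
```

In this made-up example two of the six SNPs suffice, which mirrors in miniature the reduction from 122 to 34 SNPs reported above. Exhaustive search is practical only for a handful of SNPs per gene; larger marker sets would need a heuristic.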