Article

European Journal of Human Genetics (2008) 16, 1413–1429; doi:10.1038/ejhg.2008.210

Investigation of the fine structure of European populations with applications to disease association studies

Simon C Heath1, Ivo G Gut1, Paul Brennan2, James D McKay2, Vladimir Bencko3, Eleonora Fabianova4, Lenka Foretova5, Michael Georges6, Vladimir Janout7, Michael Kabesch8, Hans E Krokan9, Maiken B Elvestad9, Jolanta Lissowska10, Dana Mates11, Peter Rudnai12, Frank Skorpen13, Stefan Schreiber14, José M Soria15, Ann-Christine Syvänen16, Pierre Meneton17, Serge Herçberg18, Pilar Galan18, Neonilia Szeszenia-Dabrowska19, David Zaridze20, Emmanuel Génin21, Lon R Cardon22 and Mark Lathrop1,23

  1. Centre National de Genotypage, Institut Genomique, Commissariat à l'énergie Atomique, Evry, France
  2. International Agency for Research on Cancer (IARC), Lyon, France
  3. First Faculty of Medicine, Institute of Hygiene and Epidemiology, Charles University in Prague, Prague, Czech Republic
  4. Specialized Institute of Hygiene and Epidemiology, Banska Bystrica, Slovakia
  5. Department of Cancer Epidemiology and Genetics, Masaryk Memorial Cancer Institute, Brno, Czech Republic
  6. Unit of Animal Genomics, Faculty of Veterinary Medicine, GIGA-Research and Department of Animal Sciences, University of Liège, Liège, Belgium
  7. Department of Preventive Medicine, Palacky University, Olomouc, Czech Republic
  8. University Children's Hospital, Ludwig Maximilian's University Munich, Munich, Germany
  9. Faculty of Medicine, Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
  10. M. Sklodowska-Curie Memorial Cancer Center and Institute of Oncology, Warsaw, Poland
  11. Institute of Public Health, Bucharest, Romania
  12. National Institute of Environmental Health, Budapest, Hungary
  13. Faculty of Medicine, Department of Laboratory Medicine, Children's and Women's Health, Norwegian University of Science and Technology, Trondheim, Norway
  14. Institute for Clinical Molecular Biology, PopGen biobank, Christian-Albrechts-University, Kiel, Germany
  15. Unitat de Genómica de Malaties Complexes, Institut de Recerca de l'Hospital de la Santa Creu i Sant Pau, C/Sant Antoni Ma Claret, 167, Barcelona, Spain
  16. Molecular Medicine, Department of Medical Sciences, Uppsala University, Uppsala, Sweden
  17. INSERM U872, Centre de Recherche des Cordeliers, Paris, France
  18. INSERM U557 et Unité de Recherche de Epidémiologie Nutritionnelle, 74 rue Marcel Cachin, Bobigny Cedex, France
  19. Department of Epidemiology, Institute of Occupational Medicine, Lodz, Poland
  20. Cancer Research Centre, Institute of Carcinogenesis, Moscow, Russia
  21. INSERM U794, Fondation Jean Dausset-CEPH, Paris, France
  22. GlaxoSmithKline, 709 Swedeland Road, King of Prussia, Pennsylvania, USA
  23. Fondation Jean Dausset-CEPH, Paris, France

Correspondence: Dr SC Heath, Centre National de Génotypage, 2, Rue Gaston Crémieux, 154 rue du Fbg. St Denis, Evry 91000, France. Tel: +160878402; Fax: +160878485; E-mail: simon.heath@gmail.com

Received 8 August 2008; Revised 8 October 2008; Accepted 9 October 2008.

Abstract

An investigation into fine-scale European population structure was carried out using high-density genetic variation data on nearly 6000 individuals originating from across Europe. The individuals were collected as control samples and were genotyped with more than 300 000 SNPs in genome-wide association studies using the Illumina Infinium platform. A major East–West gradient from Russian (Moscow) samples to Spanish samples was identified as the first principal component (PC) of the genetic diversity. The second PC identified a North–South gradient from Norway and Sweden to Romania and Spain. Variation of frequencies at markers in three separate genomic regions, surrounding LCT, HLA and HERC2, was strongly associated with this gradient. The next 18 PCs also accounted for a significant proportion of genetic diversity observed in the sample. We present a method to predict the ethnic origin of samples by comparing the sample genotypes with those from a reference set of samples of known origin. These predictions can be performed using just summary information on the known samples, and individual genotype data are not required. We discuss issues raised by these data and analyses for association studies including the matching of case-only cohorts to appropriate pre-collected control samples for genome-wide association studies.

Keywords: PCA, GWAS, European, population, association, structure
