The genome-wide structure of the Jewish people

Journal name: Nature
Volume: 466
Pages: 238–242
DOI: 10.1038/nature09103

Contemporary Jews comprise an aggregate of ethno-religious communities whose worldwide members identify with each other through various shared religious, historical and cultural traditions [1, 2]. Historical evidence suggests common origins in the Middle East, followed by migrations leading to the establishment of communities of Jews in Europe, Africa and Asia, in what is termed the Jewish Diaspora [3–5]. This complex demographic history imposes special challenges in attempting to address the genetic structure of the Jewish people [6]. Although many genetic studies have shed light on Jewish origins and on diseases prevalent among Jewish communities, including studies focusing on uniparentally and biparentally inherited markers [7–16], genome-wide patterns of variation across the vast geographic span of Jewish Diaspora communities and their respective neighbours have yet to be addressed. Here we use high-density bead arrays to genotype individuals from 14 Jewish Diaspora communities and compare these patterns of genome-wide diversity with those from 69 Old World non-Jewish populations, of which 25 have not previously been reported. These samples were carefully chosen to provide comprehensive comparisons between Jewish and non-Jewish populations in the Diaspora, as well as with non-Jewish populations from the Middle East and north Africa. Principal component and structure-like analyses identify previously unrecognized genetic substructure within the Middle East. Most Jewish samples form a remarkably tight subcluster that overlies Druze and Cypriot samples but not samples from other Levantine populations or paired Diaspora host populations. In contrast, Ethiopian Jews (Beta Israel) and Indian Jews (Bene Israel and Cochini) cluster with neighbouring autochthonous populations in Ethiopia and western India, respectively, despite a clear paternal link between the Bene Israel and the Levant. These results cast light on the variegated genetic architecture of the Middle East, and trace the origins of most Jewish Diaspora communities to the Levant.

At a glance

Figures

  1. Figure 1: PCA of high-density array data.

    a, Scatter plot of Old World individuals, showing the first two principal components. Each ring corresponds to one individual and the colour indicates the region of origin (for the full figure see Supplementary Fig. 2). b–d, A series of magnifications showing samples from Europe and the Middle East (b), Ethiopia (c) and south Asia (d). Each letter code (Supplementary Table 1) corresponds to one individual, and the colour indicates the geographic region of origin. In b, a polygon surrounding all of the individual samples belonging to a group designation highlights several population groups.

  2. Figure 2: PCA of west Eurasian high-density array data.

    Kernel densities (Supplementary Note 2) were estimated for each population sample (n > 10) on the basis of the PC1 and PC2 coordinates in Supplementary Fig. 3. Individuals from these samples were plotted using their PC1 and PC2 coordinates and overlaid on the kernel density plot.

  3. Figure 3: Population structure inferred by ADMIXTURE analysis.

    Each individual is represented by a vertical (100%) stacked column of genetic component proportions shown in colour for K = 8. The Jewish communities are labelled in colour and bold. T and B further specify Sephardi Jews from Turkey and Bulgaria, respectively. Populations introduced for the first time in this study and analysed together with the Human Genome Diversity Panel [18] data are marked with an asterisk.
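
The kernel-density overlay described for Figure 2 can be illustrated with a minimal sketch. It assumes an isotropic Gaussian kernel with a fixed, hand-picked bandwidth; the estimator actually used is described in Supplementary Note 2 and is not reproduced here. The function name `gaussian_kde_2d`, the bandwidth and the PC1/PC2 coordinates are all hypothetical toy values, not data from the study.

```python
import math

def gaussian_kde_2d(points, bandwidth):
    """Return a function that estimates density at (x, y) from 2-D points.

    Assumption for illustration only: an isotropic Gaussian kernel with one
    fixed bandwidth shared by every point.
    """
    if not points:
        raise ValueError("need at least one point")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    n = len(points)
    # Normalising constant so the estimate integrates to 1 over the plane.
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth ** 2)

    def density(x, y):
        total = 0.0
        for px, py in points:
            sq_dist = (x - px) ** 2 + (y - py) ** 2
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        return norm * total

    return density

# Hypothetical PC1/PC2 coordinates for one population sample (toy values).
sample_pcs = [(0.010, 0.020), (0.012, 0.018), (0.009, 0.022), (0.011, 0.021)]
density = gaussian_kde_2d(sample_pcs, bandwidth=0.005)

centre = density(0.0105, 0.02025)  # close to the middle of the cluster
far = density(0.05, 0.05)          # far from every plotted individual
print(centre > far)
```

Evaluating `density` on a grid of PC1/PC2 values would give the contour surface over which the individual points are drawn. The bandwidth controls how tightly the contours hug the samples, so any real analysis would need to choose it deliberately.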

Accession codes

Primary accessions

Gene Expression Omnibus

References

  1. Ben-Sasson, H. H. A History of the Jewish People (Harvard Univ. Press, 1976)
  2. De Lange, N. Atlas of the Jewish World (Phaidon Press, 1984)
  3. Mahler, R. A History of Modern Jewry (Schocken, 1971)
  4. Stillman, N. A. Jews of Arab Lands: A History and Source Book (Jewish Publication Society of America, 1979)
  5. Della Pergola, S. in Papers in Jewish Demography 1997 (eds Della Pergola, S. & Even, J.) 11–33 (The Hebrew University of Jerusalem, 1997)
  6. Cavalli-Sforza, L. L., Menozzi, A. & Piazza, A. in The History and Geography of Human Genes 4 (Princeton Univ. Press, 1994)
  7. Bauchet, M. et al. Measuring European population stratification with microarray genotype data. Am. J. Hum. Genet. 80, 948–956 (2007)
  8. Behar, D. M. et al. Counting the founders: the matrilineal genetic ancestry of the Jewish Diaspora. PLoS ONE 3, e2062 (2008)
  9. Hammer, M. F. et al. Jewish and Middle Eastern non-Jewish populations share a common pool of Y-chromosome biallelic haplotypes. Proc. Natl Acad. Sci. USA 97, 6769–6774 (2000)
  10. Kopelman, N. M. et al. Genomic microsatellites identify shared Jewish ancestry intermediate between Middle Eastern and European populations. BMC Genet. 10, 80 (2009)
  11. Need, A. C., Kasperaviciute, D., Cirulli, E. T. & Goldstein, D. B. A genome-wide genetic signature of Jewish ancestry perfectly separates individuals with and without full Jewish ancestry in a large random sample of European Americans. Genome Biol. 10, R7 (2009)
  12. Olshen, A. B. et al. Analysis of genetic variation in Ashkenazi Jews by high density SNP genotyping. BMC Genet. 9, 14 (2008)
  13. Ostrer, H. A genetic profile of contemporary Jewish populations. Nature Rev. Genet. 2, 891–898 (2001)
  14. Price, A. L. et al. Discerning the ancestry of European Americans in genetic association studies. PLoS Genet. 4, e236 (2008)
  15. Seldin, M. F. et al. European population substructure: clustering of northern and southern populations. PLoS Genet. 2, e143 (2006)
  16. Tian, C. et al. Analysis and application of European genetic substructure using 300K SNP information. PLoS Genet. 4, e4 (2008)
  17. Abdulla, M. A. et al. Mapping human genetic diversity in Asia. Science 326, 1541–1545 (2009)
  18. Li, J. Z. et al. Worldwide human relationships inferred from genome-wide patterns of variation. Science 319, 1100–1104 (2008)
  19. Jakobsson, M. et al. Genotype, haplotype and copy-number variation in worldwide human populations. Nature 451, 998–1003 (2008)
  20. Novembre, J. et al. Genes mirror geography within Europe. Nature 456, 98–101 (2008)
  21. Reich, D., Thangaraj, K., Patterson, N., Price, A. L. & Singh, L. Reconstructing Indian population history. Nature 461, 489–494 (2009)
  22. Biswas, S., Scheinfeldt, L. B. & Akey, J. M. Genome-wide insights into the patterns and determinants of fine-scale population structure in humans. Am. J. Hum. Genet. 84, 641–650 (2009)
  23. Tishkoff, S. A. et al. The genetic structure and history of Africans and African Americans. Science 324, 1035–1044 (2009)
  24. Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, e190 (2006)
  25. Hourani, A. A History of the Arab Peoples (Faber & Faber, 1991)
  26. Weiss, K. M. & Long, J. C. Non-Darwinian estimation: my ancestors, my genes’ ancestors. Genome Res. 19, 703–710 (2009)
  27. Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009)
  28. Rasmussen, M. et al. Ancient human genome sequence of an extinct Palaeo-Eskimo. Nature 463, 757–762 (2010)
  29. Gao, X. & Martin, E. R. Using allele sharing distance for detecting human population stratification. Hum. Hered. 68, 182–191 (2009)
  30. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007)


Author information

  1. These authors contributed equally to this work.

    • Doron M. Behar,
    • Bayazit Yunusbayev &
    • Mait Metspalu

Affiliations

  1. Molecular Medicine Laboratory, Rambam Health Care Campus, Haifa 31096, Israel

    • Doron M. Behar,
    • Guennady Yudkovsky &
    • Karl Skorecki
  2. Estonian Biocentre and Department of Evolutionary Biology, University of Tartu, Tartu 51010, Estonia

    • Doron M. Behar,
    • Bayazit Yunusbayev,
    • Mait Metspalu,
    • Ene Metspalu,
    • Jüri Parik,
    • Siiri Rootsi,
    • Gyaneshwer Chaubey,
    • Ildus Kutuev &
    • Richard Villems
  3. Institute of Biochemistry and Genetics, Ufa Research Center, Russian Academy of Sciences, Ufa 450054, Russia

    • Bayazit Yunusbayev,
    • Ildus Kutuev &
    • Elza K. Khusnutdinova
  4. Department of Statistics and Operations Research, School of Mathematical Sciences, Tel Aviv University, Tel Aviv 69978, Israel

    • Saharon Rosset
  5. Rappaport Faculty of Medicine and Research Institute, Technion – Israel Institute of Technology, Haifa 31096, Israel

    • Guennady Yudkovsky &
    • Karl Skorecki
  6. Research Centre for Medical Genetics, Russian Academy of Medical Sciences, Moscow 115478, Russia

    • Oleg Balanovsky
  7. Dipartimento di Genetica e Microbiologia, Università di Pavia, Pavia 27100, Italy

    • Ornella Semino
  8. Instituto de Patologia e Imunologia Molecular da Universidade do Porto (IPATIMUP), Porto 4200-465, Portugal

    • Luisa Pereira
  9. Faculdade de Medicina, Universidade do Porto, Porto 4200-319, Portugal

    • Luisa Pereira
  10. Institute of Evolutionary Biology (CSIC-UPF), CEXS-UPF-PRBB and CIBER de Epidemiología y Salud Pública, Barcelona 08003, Spain

    • David Comas
  11. Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv 69978, Israel

    • David Gurwitz &
    • Batsheva Bonne-Tamir
  12. Department of the Languages and Cultures of the Near and Middle East, Faculty of Languages and Cultures, School of Oriental and African Studies (SOAS), University of London, London WC1H 0XG, UK

    • Tudor Parfitt
  13. ARL Division of Biotechnology, University of Arizona, Tucson, Arizona 85721, USA

    • Michael F. Hammer

Contributions

D.M.B. and R.V. conceived and designed the study. B.B.T., D.C., D.G., D.M.B., E.K.K., G.C., I.K., L.P., M.F.H., O.B., O.S., T.P. and R.V. provided DNA samples to this study. E.M., J.P. and G.Y. screened and prepared the samples for the autosomal genotyping. D.M.B., E.M., G.C., M.F.H. and Si.R. generated and summarized the database for the uniparental analysis. B.Y., M.M. and Sa.R. designed and applied the modelling methodology and statistical analysis. T.P. provided expert input regarding the relevant historical aspects. B.Y., D.M.B., K.S., M.F.H., M.M., R.V. and Sa.R. wrote the paper. B.Y., D.M.B. and M.M. contributed equally to the paper. All authors discussed the results and commented on the manuscript.

Competing financial interests

The authors declare no competing financial interests.


The array data described in this paper are deposited in the Gene Expression Omnibus under accession number GSE21478.


Supplementary information

PDF files

  1. Supplementary Information (1.1M)

    This file contains Supplementary Notes 1-6, References and Supplementary Tables 1-5.

  2. Supplementary Figures (6.5M)

    This file contains Supplementary Figures 1 and 3-6 and legends for Supplementary Figures 1-6 (see separate file for Supplementary Figure 2).

  3. Supplementary Figure 2 (1.8M)

    This file shows the Principal Component Analysis of the Old World High-Density Array Data. a, Scatter plot of Old World individuals, showing the first two principal components. Here, the first PC (4.2% of variation, vertical axis) captures primarily differences between sub-Saharan Africans and the rest of the Old World. The second PC (3.4% of variation, horizontal axis) differentiates West Eurasians from South and East Asians. Axes of variation were scaled according to eigenvalues. Each letter code (Supplementary Table 1) corresponds to one individual and the colour indicates population origin. b, Scatter plot of Old World individuals, showing PC1 and PC3. c, Scatter plot of Old World individuals, showing PC1 and PC4. Note that eigenvalues for PC3 and PC4 are ~8 times smaller than for PC1 and PC2.
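
The "% of variation" figures quoted for each principal component are the ratio of that component's eigenvalue to the total variance. The sketch below shows the calculation on a toy genotype matrix, under stated assumptions: the six individuals and four markers are invented, the helper names are hypothetical, and power iteration with deflation is used only to keep the example dependency-free. It is not the pipeline used in the study; genome-scale data need dedicated eigenanalysis software.

```python
import math

def centre_columns(rows):
    """Subtract each column (marker) mean so only variation remains."""
    n, m = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in rows]

def covariance(centred):
    """Sample covariance between the columns of an already-centred matrix."""
    n, m = len(centred), len(centred[0])
    return [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(m)]
            for i in range(m)]

def mat_vec(matrix, vec):
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

def leading_eigenpairs(matrix, k, iterations=1000):
    """Top-k eigenpairs of a symmetric matrix: power iteration plus deflation."""
    m = len(matrix)
    work = [row[:] for row in matrix]
    pairs = []
    for _component in range(k):
        vec = [float(i + 1) for i in range(m)]  # arbitrary non-zero start
        norm = math.sqrt(sum(x * x for x in vec))
        vec = [x / norm for x in vec]
        for _step in range(iterations):
            nxt = mat_vec(work, vec)
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm < 1e-12:
                break
            vec = [x / norm for x in nxt]
        # Rayleigh quotient of the unit vector gives the eigenvalue.
        value = sum(v * w for v, w in zip(vec, mat_vec(work, vec)))
        pairs.append((value, vec))
        # Deflate: remove the component just found before the next pass.
        for i in range(m):
            for j in range(m):
                work[i][j] -= value * vec[i] * vec[j]
    return pairs

# Toy data: rows are individuals, columns are allele counts (0, 1 or 2).
genotypes = [
    [0, 0, 1, 2],
    [0, 1, 1, 2],
    [0, 0, 2, 2],
    [2, 2, 1, 0],
    [2, 1, 1, 0],
    [2, 2, 0, 0],
]

cov = covariance(centre_columns(genotypes))
total_variance = sum(cov[i][i] for i in range(len(cov)))
pairs = leading_eigenpairs(cov, k=2)
fractions = [value / total_variance for value, _vec in pairs]
for rank, fraction in enumerate(fractions, start=1):
    print(f"PC{rank}: {100 * fraction:.1f}% of variation")
```

In this toy matrix the first component dominates because the individuals fall into two sharply distinct groups. In genome-wide data the leading components explain only a few percent each, as the 4.2% and 3.4% quoted above show.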

Comments

  1. Comment #11635

    Jihad Orabi said:

    Yemenite Jewish samples are located closer to the Saudi and Bedouin populations, i.e. the same geographical area. Similarly, Ashkenazi and Georgian Jews cluster closer to the Georgian and Armenian populations, i.e. the same geographical area. Even the Moroccan Jews cluster outside the Levantine populations. From the PCA it is clear that Jews outside the Levant cluster closer to their native populations than to Jews or non-Jews from the Levant.

  2. Comment #13354

    Otero Hector Horacio said:

    In my hypothesis, the "L2" mtDNA marker is present in the two populations, as are the derived sibling mtDNA haplogroups "M" and "N" and the Y-chromosome markers Hg E3b and 4s, all of them from East Africa. They belong, respectively, to one of the three nuclei, or centres, of ancient Jewish populations, the one that evolved into the so-called "Syrian-European nucleus" (Hellenistic and Roman times).
    The oldest centre, to which the Ethiopians belong, developed in Napata and Elephantine (Kush), and its nucleus was later Alexandria. I call it the "Coptic nucleus". It split in two directions: northwards via Europe, where it intermixed with the Syrian-European nucleus, and southwards via the Nile and the Horn of Africa.
    The "Babylonian and Persian nucleus" is another of the three centres mentioned above and included mainly Bukharan, Iranian and Iraqi Jews.
    All of these nuclei take Judaea and Israel as their axis and pendulum.
    A fourth nucleus, or centre, which I call "East Europe", is not mainly connected with the Middle East and is not as ancient as the other three. It was the Jewish Khazar Empire, which was stirred into the current Ashkenazi population and others. All of these events were, naturally, intra-Jewish assimilations in all current Jewish populations.
    The Ashkenazim hyperhaploydia is explained by the superposition of diverse source populations, all of them of Jewish origin (which I consider intra-Jewish assimilations). One came from the "Syrian-European nucleus", which the Sephardim as well as the pre-Ashkenazim carry. The other convergence was the "Coptic Jewish nucleus", coming from Alexandria, the main and largest Judaic centre in ancient times: the burials and graves in Jewish graveyards and catacombs of Tuscany, Alsace and the Rhineland cities bear many Egyptian ornaments and figures, as well as Y and mtDNA markers. The great Jewish migration from Egypt began after the Muslim invasion from Arabia in the seventh century AD. The "Babylonian and Persian nucleus" came into contact anew when the "pre-Ashkenazim, second phase" were migrating to eastern Europe. A remarkable contact was with the fourth, "East European Jewish nucleus", not related or little related to the Middle East: the descendants of the Jewish Khazars, who spread everywhere and carried many east European and Eurasian markers. That happened between the eleventh and twelfth centuries AD.
    The Tuscan host populations came from Anatolia, as inferred from mtDNA and other markers still present today (a thread of Etruscan linkage). These markers are common in the south-east basin, among Albanians, Greeks, Tunisians and Anatolians, as well as throughout Italy and in some parts of southern France, and are practically absent in central or northern Europe and East Asia. If we compare Ashkenazi Jews with this southern European Tuscan population, we see a more European gene pool than if we compare them with central and northern European populations.

    Dr Hector H. Otero C.
    Argentina.

    See also:
    http://www.nature.com/news/2010/100603/full/news.2010.277.html
    See my comments 11149 and 12952 at the above link, with a partial Table 1. Remember that the sibling haplogroups "M" and "N", derived from "L3", also correspond to an East African origin and are not included at all (see the complete Table 1 in the reference).
