The promise of discovering population-specific disease-associated genes in South Asia


The more than 1.5 billion people who live in South Asia are correctly viewed not as a single large population but as many small endogamous groups. We assembled genome-wide data from over 2,800 individuals from over 260 distinct South Asian groups. We identified 81 unique groups, 14 of which had estimated census sizes of more than 1 million, that descend from founder events more extreme than those in Ashkenazi Jews and Finns, both of which have high rates of recessive disease due to founder events. We identified multiple examples of recessive diseases in South Asia that are the result of such founder events. This study highlights an underappreciated opportunity for decreasing disease burden among South Asians through discovery of and testing for recessive disease-associated genes.
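As a rough illustration of why a strong founder event can raise the rate of recessive disease, the sketch below applies the Hardy-Weinberg expectation that, under random mating within a group, the fraction of individuals homozygous for an allele of frequency q is q squared. The allele frequencies and the function names are hypothetical values chosen for illustration. They are not estimates from this study, and this is not the analysis the authors performed.

```python
def homozygote_fraction(q: float) -> float:
    """Expected fraction of homozygotes for an allele of frequency q,
    assuming Hardy-Weinberg proportions (random mating within the group)."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return q * q


def fold_increase(q_general: float, q_founder: float) -> float:
    """Ratio of expected recessive-disease incidence in a founder group
    to that in a large outbred population, for the same variant."""
    return homozygote_fraction(q_founder) / homozygote_fraction(q_general)


if __name__ == "__main__":
    # Hypothetical frequencies: a variant that is rare in a large population
    # but has drifted to a higher frequency after a founder event.
    q_general = 0.001
    q_founder = 0.02
    print(f"incidence, general population: {homozygote_fraction(q_general):.2e}")
    print(f"incidence, founder group:      {homozygote_fraction(q_founder):.2e}")
    print(f"fold increase:                 {fold_increase(q_general, q_founder):.0f}")
```

Because incidence scales with the square of allele frequency, a modest rise in the frequency of a single variant translates into a much larger rise in the number of affected individuals, which is the intuition behind screening for group-specific recessive variants.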






Acknowledgements

We are thankful to the many Indian, Pakistani, Bangladeshi, Sri Lankan, and Nepalese individuals who contributed the DNA samples analyzed here, including the patients with PPD and MPS IVA. We are grateful to A. Basu and P. P. Majumder (National Institute of Biomedical Genomics, Kalyani, India) for early sharing of data. Funding was provided by an NIGMS (GM007753) fellowship to N.N.; a Translational Seed Fund grant from the Dean's Office of Harvard Medical School, and grant HG006399 to D.R.; a Council of Scientific and Industrial Research, Government of India grant to K.T.; support from TIFAC-CORE to S.A.V. and K.S.; and NIGMS grant 115006 to P.M. The study of MPS IVA patients was funded by grants from the Department of Biotechnology, Government of India (BT/PR4224/MED/97/60/2011 to S.S. and S.M.J.) and the Department of Science and Technology, Government of India (SR/WOS-A/LS-83/2011 to S.S.). Funding for the mutation analysis of Indian patients with PPD was provided by the Indian Council of Medical Research (BMS 54/2/2013) to K.M.G. D.R. is supported as an Investigator of the Howard Hughes Medical Institute.

Author information

Author notes

    • Niraj Rai

    Present address: Birbal Sahni Institute of Palaeosciences, Lucknow, India.

    • David Reich & Kumarasamy Thangaraj

    These authors jointly directed this work.


Affiliations

  1. Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA.

    • Nathan Nakatsuka, Arti Tandon & David Reich

  2. Harvard-MIT Division of Health Sciences and Technology, Harvard Medical School, Boston, Massachusetts, USA.

    • Nathan Nakatsuka

  3. Department of Biological Sciences, Columbia University, New York, New York, USA.

    • Priya Moorjani

  4. Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.

    • Priya Moorjani, Arti Tandon, Nick Patterson & David Reich

  5. CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India.

    • Niraj Rai & Kumarasamy Thangaraj

  6. Anthropological Survey of India, Kolkata, India.

    • Biswanath Sarkar

  7. Department of Medical Genetics, Kasturba Medical College, Manipal University, Manipal, India.

    • Gandham SriLakshmi Bhavani & Katta Mohan Girisha

  8. Department of Applied Zoology, Mangalore University, Mangalore, India.

    • Mohammed S Mustak

  9. Centre for Human Genetics, Bangalore, India.

    • Sudha Srinivasan

  10. Amity Institute of Biotechnology, Amity University, Noida, India.

    • Amit Kaushik

  11. School of Life Sciences, Manipal University, Manipal, India.

    • Saadi Abdul Vahab & Kapaettu Satyamoorthy

  12. Fetal Care Research Foundation, Chennai, India.

    • Sujatha M Jagadeesh

  13. Genome Foundation, Hyderabad, India.

    • Lalji Singh

  14. Howard Hughes Medical Institute, Harvard Medical School, Boston, Massachusetts, USA.

    • David Reich




Contributions

N.N., P.M., D.R., and K.T. conceived the study. N.N., P.M., N.R., B.S., A.T., N.P., and D.R. performed analysis. G.S.B., K.M.G., M.S.M., S.S., A.K., S.A.V., S.M.J., K.S., L.S., and K.T. collected data. N.N., D.R., and K.T. wrote the manuscript with the help of all coauthors.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to David Reich or Kumarasamy Thangaraj.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–6, Supplementary Tables 1–4 and 6, and Supplementary Note.

  2. Life Sciences Reporting Summary

Excel files

  1. Supplementary Table 5

    IBD, FST, and group-specific drift analyses.