Abstract
The more than 1.5 billion people who live in South Asia are correctly viewed not as a single large population but as many small endogamous groups. We assembled genome-wide data from over 2,800 individuals from over 260 distinct South Asian groups. We identified 81 unique groups, 14 of which had estimated census sizes of more than 1 million, that descend from founder events more extreme than those in Ashkenazi Jews and Finns, both of which have high rates of recessive disease due to founder events. We identified multiple examples of recessive diseases in South Asia that are the result of such founder events. This study highlights an underappreciated opportunity for decreasing disease burden among South Asians through discovery of and testing for recessive disease-associated genes.
Acknowledgements
We are thankful to the many Indian, Pakistani, Bangladeshi, Sri Lankan, and Nepalese individuals who contributed the DNA samples analyzed here, including the patients with PPD and MPS IVA. We are grateful to A. Basu and P. P. Majumder (National Institute of Biomedical Genomics, Kalyani, India) for early sharing of data. Funding was provided by an NIGMS (GM007753) fellowship to N.N.; a Translational Seed Fund grant from the Dean's Office of Harvard Medical School, and grant HG006399 to D.R.; a Council of Scientific and Industrial Research, Government of India grant to K.T.; support from TIFAC-CORE to S.A.V. and K.S.; and NIGMS grant 115006 to P.M. The study of MPS IVA patients was funded by grants from the Department of Biotechnology, Government of India (BT/PR4224/MED/97/60/2011 to S.S. and S.M.J.) and the Department of Science and Technology, Government of India (SR/WOS-A/LS-83/2011 to S.S.). Funding for the mutation analysis of Indian patients with PPD was provided by the Indian Council of Medical Research (BMS 54/2/2013) to K.M.G. D.R. is supported as an Investigator of the Howard Hughes Medical Institute.
Author information
Contributions
N.N., P.M., D.R., and K.T. conceived the study. N.N., P.M., N.R., B.S., A.T., N.P., and D.R. performed analysis. G.S.B., K.M.G., M.S.M., S.S., A.K., S.A.V., S.M.J., K.S., L.S., and K.T. collected data. N.N., D.R., and K.T. wrote the manuscript with the help of all coauthors.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–6, Supplementary Tables 1–4 and 6, and Supplementary Note. (PDF 8052 kb)
Supplementary Table 5
IBD, FST, and group-specific drift analyses. (XLSX 114 kb)
About this article
Cite this article
Nakatsuka, N., Moorjani, P., Rai, N. et al. The promise of discovering population-specific disease-associated genes in South Asia. Nat Genet 49, 1403–1407 (2017). https://doi.org/10.1038/ng.3917
DOI: https://doi.org/10.1038/ng.3917