Multi-ancestry polygenic mechanisms of type 2 diabetes

Abstract

Type 2 diabetes (T2D) is a multifactorial disease with substantial genetic risk, for which the underlying biological mechanisms are not fully understood. In this study, we identified multi-ancestry T2D genetic clusters by analyzing genetic data from diverse populations in 37 published T2D genome-wide association studies representing more than 1.4 million individuals. We implemented soft clustering with 650 T2D-associated genetic variants and 110 T2D-related traits, capturing known and novel T2D clusters with distinct cardiometabolic trait associations across two independent biobanks representing diverse genetic ancestral populations (African, n = 21,906; Admixed American, n = 14,410; East Asian, n = 2,422; European, n = 90,093; and South Asian, n = 1,262). The 12 genetic clusters were enriched for specific single-cell regulatory regions. Several of the polygenic scores derived from the clusters differed in distribution among ancestry groups, including a significantly higher proportion of lipodystrophy-related polygenic risk in East Asian ancestry. T2D risk was equivalent at a body mass index (BMI) of 30 kg/m² in the European subpopulation and 24.2 (22.9–25.5) kg/m² in the East Asian subpopulation; after adjusting for cluster-specific genetic risk, the equivalent BMI threshold increased to 28.5 (27.1–30.0) kg/m² in the East Asian group. Thus, these multi-ancestry T2D genetic clusters encompass a broader range of biological mechanisms and provide preliminary insights to explain ancestry-associated differences in T2D risk profiles.

Fig. 1: Key loci and traits of multi-ancestry T2D genetic clusters.
Fig. 2: Multi-ancestry T2D genetic cluster associations with continuous traits and clinical phenotypes.
Fig. 3: Enrichment for cell-type-specific enhancers in multi-ancestry T2D clusters.
Fig. 4: Ancestry-specific relationship among T2D genetic clusters, BMI and T2D risk.

Data availability

All referenced GWAS summary statistics are publicly available and are cited in Supplementary Tables 1, 3 and 9. Researchers can apply to access individual-level data in the All of Us program (https://researchallofus.org/). Individual-level data in the MGB Biobank are available only with approval from the MGB IRB. Databases of epigenomic activity are available online for CATLAS (https://catlas.org) and Roadmap (https://egg2.wustl.edu/roadmap/).

Code availability

Code for variant pre-processing, bNMF clustering and basic visualizations is available at https://github.com/gwas-partitioning/bnmf-clustering.

References

  1. Tobias, D. K. et al. Second international consensus report on gaps and opportunities for the clinical translation of precision diabetes medicine. Nat. Med. 29, 2438–2457 (2023).

  2. Misra, S. et al. Precision subclassification of type 2 diabetes: a systematic review. Commun. Med. (Lond.) 3, 138 (2023).

  3. Udler, M. S. et al. Type 2 diabetes genetic loci informed by multi-trait associations point to disease mechanisms and subtypes: a soft clustering analysis. PLoS Med. 15, e1002654 (2018).

  4. Kim, H. et al. High-throughput genetic clustering of type 2 diabetes loci reveals heterogeneous mechanistic pathways of metabolic disease. Diabetologia 66, 495–507 (2023).

  5. Mahajan, A. et al. Refining the accuracy of validated target identification through coding variant fine-mapping in type 2 diabetes. Nat. Genet. 50, 559–571 (2018).

  6. Suzuki, K. et al. Genetic drivers of heterogeneity in type 2 diabetes pathophysiology. Nature https://doi.org/10.1038/s41586-024-07019-6 (2024).

  7. Martin, A. R. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 51, 584–591 (2019).

  8. Mahajan, A. et al. Multi-ancestry genetic study of type 2 diabetes highlights the power of diverse populations for discovery and translation. Nat. Genet. 54, 560–572 (2022).

  9. Vujkovic, M. et al. Discovery of 318 new risk loci for type 2 diabetes and related vascular outcomes among 1.4 million participants in a multi-ancestry meta-analysis. Nat. Genet. 52, 680–691 (2020).

  10. Spracklen, C. N. et al. Identification of type 2 diabetes loci in 433,540 East Asian individuals. Nature 582, 240–245 (2020).

  11. BasuRay, S., Wang, Y., Smagris, E., Cohen, J. C. & Hobbs, H. H. Accumulation of PNPLA3 on lipid droplets is the basis of associated hepatic steatosis. Proc. Natl Acad. Sci. USA 116, 9521–9526 (2019).

  12. Lee, S. M., Muratalla, J., Sierra-Cruz, M. & Cordoba-Chacon, J. Role of hepatic peroxisome proliferator-activated receptor γ in non-alcoholic fatty liver disease. J. Endocrinol. 257, e220155 (2023).

  13. Getz, G. S. & Reardon, C. A. Apoprotein E and reverse cholesterol transport. Int. J. Mol. Sci. 19, 3479 (2018).

  14. Zhang, K. et al. A single-cell atlas of chromatin accessibility in the human genome. Cell 184, 5985–6001 (2021).

  15. Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).

  16. Caleyachetty, R. et al. Ethnicity-specific BMI cutoffs for obesity based on type 2 diabetes risk in England: a population-based cohort study. Lancet Diabetes Endocrinol. 9, 419–426 (2021).

  17. Yaghootkar, H., Whitcher, B., Bell, J. D. & Thomas, E. L. Ethnic differences in adiposity and diabetes risk—insights from genetic studies. J. Intern. Med. 288, 271–283 (2020).

  18. Ntuk, U. E., Gill, J. M. R., Mackay, D. F., Sattar, N. & Pell, J. P. Ethnic-specific obesity cutoffs for diabetes risk: cross-sectional study of 490,288 UK Biobank participants. Diabetes Care 37, 2500–2507 (2014).

  19. Hsu, W. C., Araneta, M. R. G., Kanaya, A. M., Chiang, J. L. & Fujimoto, W. BMI cut points to identify at-risk Asian Americans for type 2 diabetes screening. Diabetes Care 38, 150–158 (2015).

  20. Rodriguez, L. A. et al. Examining if the relationship between BMI and incident type 2 diabetes among middle-older aged adults varies by race/ethnicity: evidence from the Multi-Ethnic Study of Atherosclerosis (MESA). Diabet. Med. 38, e14377 (2021).

  21. Aggarwal, R. et al. Diabetes screening by race and ethnicity in the United States: equivalent body mass index and age thresholds. Ann. Intern. Med. 175, 765–773 (2022).

  22. WHO Expert Consultation. Appropriate body-mass index for Asian populations and its implications for policy and intervention strategies. Lancet 363, 157–163 (2004).

  23. Inker, L. A. et al. New creatinine- and cystatin C-based equations to estimate GFR without race. N. Engl. J. Med. 385, 1737–1749 (2021).

  24. Zhao, W. et al. Identification of new susceptibility loci for type 2 diabetes and shared etiological pathways with coronary heart disease. Nat. Genet. 49, 1450–1457 (2017).

  25. Goodarzi, M. O. & Rotter, J. I. Genetics insights in the relationship between type 2 diabetes and coronary heart disease. Circ. Res. 126, 1526–1548 (2020).

  26. Sattar, N. et al. Statins and risk of incident diabetes: a collaborative meta-analysis of randomised statin trials. Lancet 375, 735–742 (2010).

  27. González-Lleó, A. M., Sánchez-Hernández, R. M., Boronat, M. & Wägner, A. M. Diabetes and familial hypercholesterolemia: interplay between lipid and glucose metabolism. Nutrients 14, 1503 (2022).

  28. Wei, Y. et al. Associations between serum total bilirubin, obesity and type 2 diabetes. Diabetol. Metab. Syndr. 13, 143 (2021).

  29. Hansen, M. et al. Bile acid sequestrants for glycemic control in patients with type 2 diabetes: a systematic review with meta-analysis of randomized controlled trials. J. Diabetes Complications 31, 918–927 (2017).

  30. Glunk, V. et al. A non-coding variant linked to metabolic obesity with normal weight affects actin remodelling in subcutaneous adipocytes. Nat. Metab. 5, 861–879 (2023).

  31. Fathzadeh, M. et al. FAM13A affects body fat distribution and adipocyte function. Nat. Commun. 11, 1465 (2020).

  32. Li, B.-T. et al. Disruption of the ERLIN–TM6SF2–APOB complex destabilizes APOB and contributes to non-alcoholic fatty liver disease. PLoS Genet. 16, e1008955 (2020).

  33. ElSayed, N. A. et al. 2. Classification and diagnosis of diabetes: Standards of Care in Diabetes—2023. Diabetes Care 46, S19–S40 (2023).

  34. Vyas, D. A., Eisenstein, L. G. & Jones, D. S. Hidden in plain sight—reconsidering the use of race correction in clinical algorithms. N. Engl. J. Med. 383, 874–882 (2020).

  35. Narayan, K. M. V. et al. Incidence and pathophysiology of diabetes in South Asian adults living in India and Pakistan compared with US blacks and whites. BMJ Open Diabetes Res. Care 9, e001927 (2021).

  36. Narayan, K. M. V. & Kanaya, A. M. Why are South Asians prone to type 2 diabetes? A hypothesis based on underexplored pathways. Diabetologia 63, 1103–1109 (2020).

  37. All of Us Research Program Investigators et al. The ‘All of Us’ Research Program. N. Engl. J. Med. 381, 668–676 (2019).

  38. Castro, V. M. et al. The Mass General Brigham Biobank portal: an i2b2-based data repository linking disparate and high-dimensional patient data to support multimodal analytics. J. Am. Med. Inform. Assoc. 29, 643–651 (2022).

  39. Kho, A. N. et al. Use of diverse electronic medical record systems to identify genetic risk for type 2 diabetes within a genome-wide association study. J. Am. Med. Inform. Assoc. 19, 212–218 (2012).

  40. Szczerbinski, L. et al. Algorithms for the identification of prevalent diabetes in the All of Us Research Program validated using polygenic scores—a new resource for diabetes precision medicine. Preprint at bioRxiv https://doi.org/10.1101/2023.09.05.23295061 (2023).

  41. Wakefield, J. A Bayesian measure of the probability of false discovery in genetic epidemiology studies. Am. J. Hum. Genet. 81, 208–227 (2007).

  42. Crawford, S. L. Correlation and regression. Circulation 114, 2083–2088 (2006).

  43. DiCorpo, D. et al. Type 2 diabetes partitioned polygenic scores associate with disease outcomes in 454,193 individuals across 13 cohorts. Diabetes Care 45, 674–683 (2022).

  44. Patel, A. P. et al. Association of rare pathogenic DNA variants for familial hypercholesterolemia, hereditary breast and ovarian cancer syndrome, and Lynch syndrome with disease risk in adults according to family history. JAMA Netw. Open 3, e203959 (2020).

  45. Magudia, K. et al. Population-scale CT-based body composition analysis of a large outpatient population using deep learning to derive age-, sex-, and race-specific reference curves. Radiology 298, 319–329 (2021).

  46. Bridge, C. P. et al. A fully automated deep learning pipeline for multi-vertebral level quantification and characterization of muscle and adipose tissue on chest CT scans. Radiol. Artif. Intell. 4, e210080 (2022).

Acknowledgements

A.J.D. is supported by National Institutes of Health (NIH)/National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) T32 DK007028 and NIH/NIDDK F32 DK137487. K.E.W. is supported by NIH K01DK133637. L.S. is supported by funds from the Ministry of Education and Science of Poland within the ‘Excellence Initiative—Research University’ project, by the Ministry of Health of Poland within the ‘Center of Artificial Intelligence in Medicine at the Medical University of Bialystok’ project and by American Diabetes Association grant 11-22-PDFPM-03. M.C. is supported by the Novo Nordisk Foundation (NNF21SA0072102) and NIDDK UM1 DK126185. J.M.M. is supported by American Diabetes Association Innovative and Clinical Translational Award 1-19-ICTS-068, American Diabetes Association grant 11-22-ICTSPM-16 and National Human Genome Research Institute U01HG011723. M.S.U. is supported by NIDDK K23DK114551, NIDDK R03DK131249 and Doris Duke Foundation Award 2022063. The authors thank the participants of the All of Us Research Program and MGB Biobank. In addition, we thank the Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC) for providing pre-publication access to GWAS summary statistics for post-challenge insulin resistance measures. Finally, we also thank J. Dupuis for assistance with statistical analysis.

Author information

Contributions

Conceived and designed the study: K.S., A.J.D. and M.S.U. Conducted analysis: K.S., A.J.D., C.M. and A.H.-C. Curated data: K.S., A.J.D., H.K., S.H., R.M., P.H.S., K.E.W., L.S., T.D.M., V.K., A.W., A.K.M. and J.M.M. Provided feedback on analysis: N.Z., M.C., J.C.F., A.K.M., J.M.M., K.J.G. and M.S.U. Wrote the initial manuscript draft: K.S., A.J.D. and M.S.U. All authors approved the final version of the manuscript.

Corresponding author

Correspondence to Miriam S. Udler.

Ethics declarations

Competing interests

T.D.M. currently works for Vertex Pharmaceuticals. K.S., M.C., J.C.F., J.M.M. and M.S.U. are currently part of a collaboration project between the Broad Institute and Novo Nordisk. M.S.U. is an unfunded collaborator with Nightingale and AstraZeneca. None of the other authors declare competing interests.

Peer review

Peer review information

Nature Medicine thanks Constantin Polychronakos and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Anna Maria Ranzoni, in collaboration with the Nature Medicine team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Overview of high-throughput bNMF pipeline for multi-ancestry (MA) clusters.

Flowchart of the steps implemented in the clustering pipeline. Steps include: 1) extract variants from diverse set of T2D GWAS datasets, 2) apply LD-pruning across reference panels for all populations included, to ensure independent genetic signals, 3) find proxy variants for variants that are multi-allelic, ambiguous, or have low trait counts, 4) align variants to risk increasing alleles in largest MA T2D GWAS and remove if their P value in this GWAS does not meet a Bonferroni threshold, 5) filter trait GWAS by minimum sample size, 6) filter trait GWAS by a minimum Bonferroni-corrected P value across the selected variants, 7) filter by correlation between traits and 8) generate the variant by trait association matrix which serves as the bNMF input.
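The trait-filtering part of this flowchart (steps 5 to 8) can be sketched in a few lines. This is an illustrative sketch only, not the authors' pipeline (their code is in the repository listed under Code availability): the function name `build_matrix`, the thresholds `min_n` and `max_abs_r`, the greedy rule for choosing between correlated traits and all toy numbers are assumptions made for the example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def build_matrix(z, p, sample_size, min_n, max_abs_r):
    """z[t] and p[t] hold per-variant z-scores and P values for trait t."""
    n_variants = len(next(iter(z.values())))
    bonferroni = 0.05 / n_variants  # corrected across the selected variants
    # Steps 5-6: keep traits with enough samples and at least one strong signal.
    kept = [t for t in z
            if sample_size[t] >= min_n and min(p[t]) < bonferroni]
    # Step 7 (assumed rule): visit traits from strongest to weakest signal and
    # drop any trait that is highly correlated with one already selected.
    kept.sort(key=lambda t: min(p[t]))
    selected = []
    for t in kept:
        if all(abs(pearson(z[t], z[s])) < max_abs_r for s in selected):
            selected.append(t)
    # Step 8: variant-by-trait association matrix (rows are variants).
    matrix = [[z[t][i] for t in selected] for i in range(n_variants)]
    return selected, matrix

# Toy input: four variants, five hypothetical traits.
z = {"BMI": [4.0, -1.0, 2.0, 0.5], "weight": [3.9, -1.1, 2.1, 0.4],
     "HDL": [-0.5, 3.0, -2.0, 5.0], "tiny_trait": [6.0, 0.1, 0.2, 0.3],
     "null_trait": [0.1, -0.2, 0.3, 0.1]}
p = {"BMI": [1e-5, 0.3, 0.05, 0.6], "weight": [1e-4, 0.27, 0.04, 0.7],
     "HDL": [0.6, 0.003, 0.05, 1e-6], "tiny_trait": [1e-9, 0.9, 0.8, 0.7],
     "null_trait": [0.9, 0.8, 0.7, 0.9]}
sample_size = {"BMI": 500000, "weight": 400000, "HDL": 300000,
               "tiny_trait": 2000, "null_trait": 350000}

selected, matrix = build_matrix(z, p, sample_size, min_n=5000, max_abs_r=0.8)
print(selected)  # traits that survive all three filters
```

In the toy data, `tiny_trait` fails the sample-size filter, `null_trait` has no variant below the Bonferroni threshold, and `weight` is dropped because it is almost perfectly correlated with `BMI`.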

Extended Data Fig. 2 The multi-ancestry clusters recapture several key pathways that were identified in our previous papers.

Correlation heatmap for trait cluster weights from the multi-ancestry clusters versus the (a) Udler et al. clusters (ref. 3) and (b) Kim et al. clusters (ref. 4). Correlation coefficients are displayed for trait pairs where R > 0.

Extended Data Fig. 3 Common T2D genetic clusters are shared across individual ancestry groups.

Heatmaps display the correlation of the trait cluster weights in the European, East Asian, African and Admixed American clusters versus the (a) multi-ancestry and (b) Udler et al. clusters (ref. 3).

Extended Data Fig. 4 Sex-stratified association of multi-ancestry T2D genetic clusters with anthropometric traits.

Each dot displays the sex-stratified association between selected multi-ancestry T2D cluster pPS and selected anthropometric traits. (a) Visceral adipose tissue [VAT], subcutaneous adipose tissue [SAT], and VAT/SAT ratio, measured in a subset of approximately 9,000 MGB Biobank participants with available data. (b) Waist circumference, hip circumference, and waist/hip ratio, measured in the All of Us Cohort. Each outcome was normalized to a standard normal distribution (for all participants, females only, or males only). Each dot indicates the effect per one standard deviation increase in the pPS. Error bars denote the standard error. P values were obtained from two-sided t tests, and asterisks indicate P < 0.05. Complete statistics (including exact P values and the number of individuals measured for each outcome) are provided in Supplementary Table 13.

Extended Data Fig. 5 Variation in distribution of multi-ancestry T2D genetic clusters across ancestry groups.

Each histogram displays the distribution of the pPS for the indicated multi-ancestry T2D genetic cluster. For each cluster, the pPS for the entire cohort was normalized to a standard normal distribution, and a separate curve is displayed for each genetically inferred ancestry group. All analyses were performed in a meta-analysis of All of Us and MGB Biobank. AFR, African; AMR, Admixed American; EAS, East Asian; EUR, European; SAS, South Asian.

Extended Data Fig. 6 Proportion of total T2D genetic risk attributable to each multi-ancestry T2D cluster.

For each individual, the total T2D genetic risk was calculated as the sum of the pPS across each of the 12 multi-ancestry T2D genetic clusters. All individuals were then grouped according to genetically inferred ancestry. Each bar displays the proportion of the total T2D genetic risk conferred by each specific cluster. This graph represents a meta-analysis of All of Us and MGB Biobank. AFR, African; AMR, Admixed American; EAS, East Asian; EUR, European; SAS, South Asian.
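The proportion calculation described in this caption can be illustrated with a small sketch. It assumes non-negative raw pPS values and aggregates by summing scores over all individuals in an ancestry group before dividing; the caption does not specify the aggregation, so that choice, the function name `cluster_risk_proportions`, the placeholder cluster "Other" and the toy scores are all assumptions.

```python
from collections import defaultdict

def cluster_risk_proportions(individuals):
    """individuals: list of (ancestry, {cluster: pPS}) pairs.

    Returns {ancestry: {cluster: share of summed pPS}}.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for ancestry, scores in individuals:
        for cluster, pps in scores.items():
            totals[ancestry][cluster] += pps
    proportions = {}
    for ancestry, by_cluster in totals.items():
        grand_total = sum(by_cluster.values())  # total genetic risk in the group
        proportions[ancestry] = {c: v / grand_total
                                 for c, v in by_cluster.items()}
    return proportions

# Toy scores, not study data.
individuals = [
    ("EUR", {"Lipodystrophy 1": 1.0, "Lipodystrophy 2": 1.0, "Other": 2.0}),
    ("EUR", {"Lipodystrophy 1": 1.0, "Lipodystrophy 2": 1.0, "Other": 2.0}),
    ("EAS", {"Lipodystrophy 1": 2.0, "Lipodystrophy 2": 1.0, "Other": 1.0}),
]
shares = cluster_risk_proportions(individuals)
print(shares["EAS"]["Lipodystrophy 1"])
```

Each ancestry group's shares sum to 1, which is what allows the bars to be compared across groups of very different size.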

Extended Data Fig. 7 Validation of relationship between T2D genetic clusters, BMI, and T2D risk.

(a, b) Relationship between BMI and T2D risk, classified by genetic ancestry. T2D risk was controlled for age, sex, BMI, and genetic ancestry group. (a) T2D risk in all participants. (b) T2D risk in a subset of participants with similar Lipodystrophy 1 pPS (ranging from 0.5–1.5, representing 22.8% of the overall population). Analyses represent a meta-analysis of MGB Biobank and All of Us. (c) Relationship between BMI and T2D risk, assessed within EUR ancestry participants in All of Us. Covariates include age, sex, BMI, and 10 principal components. T2D risk was calculated separately for individuals in the top and bottom deciles of the Lipodystrophy 1 pPS. In all panels, the horizontal dashed line is chosen at a representative level to indicate the outcome at a BMI of 30 kg/m² in the EUR ancestry group.
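The idea of an "equivalent BMI" read off the dashed line can be made concrete with a simplified model. The sketch below assumes a logistic model with a shared BMI slope and an additive ancestry offset, which is an assumption for illustration and not necessarily the model fitted in the study; the coefficients are arbitrary toy values, not estimates from the data.

```python
from math import exp

def logistic(x):
    return 1.0 / (1.0 + exp(-x))

def equivalent_bmi(reference_bmi, bmi_slope, ancestry_offset):
    """BMI at which a group reaches the risk of the reference group.

    Assumed model: logit(risk) = b0 + bmi_slope * BMI + ancestry_offset,
    where the offset is 0 for the reference group. Setting the two logits
    equal and solving for BMI gives the expression below.
    """
    return reference_bmi - ancestry_offset / bmi_slope

# Toy coefficients (hypothetical): intercept, log-odds per BMI unit, group offset.
b0, bmi_slope, offset = -5.0, 0.1, 0.5

bmi_eq = equivalent_bmi(30.0, bmi_slope, offset)
risk_reference = logistic(b0 + bmi_slope * 30.0)
risk_group = logistic(b0 + bmi_slope * bmi_eq + offset)
print(bmi_eq, risk_reference, risk_group)
```

Under this model, adjusting for a covariate that explains part of the ancestry offset shrinks the offset and moves the equivalent BMI back toward the reference value.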

Extended Data Fig. 8 Comparison of cluster-specific risk allele frequencies (RAF) in EUR and other ancestry groups.

We compared the T2D risk allele frequencies between the EUR ancestry group and each of the other groups (AFR, AMR, EAS, and SAS). For each multi-ancestry cluster, we plotted the allele frequency for cluster-defining variants (those above the bNMF weight cutoff). In both the Lipodystrophy 1 and Lipodystrophy 2 clusters, the majority of variants had higher risk allele frequency in EAS compared to EUR ancestry, consistent with our observation that the pPS distributions for these two clusters were shifted to the right in EAS ancestry.

Extended Data Fig. 9 Ancestry-specific variation in adipose volume and triglycerides.

(a, b) Relationship between VAT/SAT ratio and T2D risk, classified by genetic ancestry. (a) Unadjusted T2D risk, controlling for age, sex, BMI, and genetic ancestry group. (b) Adjusted T2D risk, after additionally controlling for Lipodystrophy 1 pPS and Lipodystrophy 2 pPS. Analyses represent a subset of ~9,000 MGB Biobank participants with available VAT/SAT data. (c, d) Relationship between BMI and triglyceride levels, classified by genetic ancestry. (c) Unadjusted triglyceride level, controlling for age, sex, BMI, and genetic ancestry group. (d) Adjusted triglyceride level, after additionally controlling for Lipodystrophy 1 pPS and Lipodystrophy 2 pPS. Analyses represent a meta-analysis of MGB Biobank and All of Us. (e) Relationship between BMI and triglyceride levels, assessed within EUR ancestry participants in All of Us. Covariates include age, sex, BMI, and 10 principal components. Triglyceride levels were calculated separately for individuals in the top and bottom deciles of the Lipodystrophy 1 pPS. In all panels, the horizontal dashed line is chosen at a representative level to indicate the outcome at a BMI of 30 kg/m² in the EUR ancestry group.

Extended Data Fig. 10 Conservation of biological pathways between the multi-ancestry and T2DGGI clusters.

Variants in the T2DGGI clusters (N_SNP = 1,289) were assigned a proxy variant from the multi-ancestry variant set (N_SNP = 650) based on linkage disequilibrium values (r² > 0.5). The clusters were then cross-examined using a two-sided Wilcoxon rank-sum test based on the multi-ancestry bNMF variant weights.
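A two-sided Wilcoxon rank-sum test of the kind named here can be sketched with the standard library alone. The version below uses mid-ranks for ties and the normal approximation without a tie correction; how the authors grouped variants for each comparison is not stated in the caption, so the two toy groups (bNMF weights for proxies of one cluster versus the remaining variants) are an assumed setup with invented values.

```python
from statistics import NormalDist

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, mid-ranks).

    Returns (z, p), where z > 0 means values in x tend to rank higher than y.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2  # average of the 1-based ranks i+1..j
        rank_sum_x += mid_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    expected = n1 * (n1 + n2 + 1) / 2
    sd = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (rank_sum_x - expected) / sd
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Toy weights, not study data.
weights_in_cluster = [0.9, 1.2, 1.5, 2.0]
weights_other = [0.1, 0.2, 0.3, 0.4, 0.5]
z_stat, p_value = rank_sum_test(weights_in_cluster, weights_other)
print(round(z_stat, 3), round(p_value, 4))
```

For samples this small an exact test would normally be preferred; the normal approximation is used here only to keep the sketch short.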

Supplementary information

Reporting Summary

Supplementary Tables

Supplementary data workbook

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Smith, K., Deutsch, A.J., McGrail, C. et al. Multi-ancestry polygenic mechanisms of type 2 diabetes. Nat Med 30, 1065–1074 (2024). https://doi.org/10.1038/s41591-024-02865-3
