Abstract
Type 2 diabetes (T2D) is a multifactorial disease with substantial genetic risk, for which the underlying biological mechanisms are not fully understood. In this study, we identified multi-ancestry T2D genetic clusters by analyzing genetic data from diverse populations in 37 published T2D genome-wide association studies representing more than 1.4 million individuals. We implemented soft clustering with 650 T2D-associated genetic variants and 110 T2D-related traits, capturing known and novel T2D clusters with distinct cardiometabolic trait associations across two independent biobanks representing diverse genetic ancestral populations (African, n = 21,906; Admixed American, n = 14,410; East Asian, n = 2,422; European, n = 90,093; and South Asian, n = 1,262). The 12 genetic clusters were enriched for specific single-cell regulatory regions. Several of the polygenic scores derived from the clusters differed in distribution among ancestry groups, including a significantly higher proportion of lipodystrophy-related polygenic risk in East Asian ancestry. T2D risk was equivalent at a body mass index (BMI) of 30 kg m⁻² in the European subpopulation and 24.2 (22.9–25.5) kg m⁻² in the East Asian subpopulation; after adjusting for cluster-specific genetic risk, the equivalent BMI threshold increased to 28.5 (27.1–30.0) kg m⁻² in the East Asian group. Thus, these multi-ancestry T2D genetic clusters encompass a broader range of biological mechanisms and provide preliminary insights to explain ancestry-associated differences in T2D risk profiles.
Data availability
All referenced GWAS summary statistics are publicly available and are cited in Supplementary Tables 1, 3 and 9. Researchers can apply to access individual-level data in the All of Us program (https://researchallofus.org/). Individual-level data in the MGB Biobank are available only with approval from the MGB IRB. Databases of epigenomic activity are available online for CATLAS (https://catlas.org) and Roadmap (https://egg2.wustl.edu/roadmap/).
Code availability
Code for variant pre-processing, bNMF clustering and basic visualizations is available at https://github.com/gwas-partitioning/bnmf-clustering.
Change history
17 May 2024
A Correction to this paper has been published: https://doi.org/10.1038/s41591-024-03066-8
References
Tobias, D. K. et al. Second international consensus report on gaps and opportunities for the clinical translation of precision diabetes medicine. Nat. Med. 29, 2438–2457 (2023).
Misra, S. et al. Precision subclassification of type 2 diabetes: a systematic review. Commun. Med. (Lond.) 3, 138 (2023).
Udler, M. S. et al. Type 2 diabetes genetic loci informed by multi-trait associations point to disease mechanisms and subtypes: a soft clustering analysis. PLoS Med. 15, e1002654 (2018).
Kim, H. et al. High-throughput genetic clustering of type 2 diabetes loci reveals heterogeneous mechanistic pathways of metabolic disease. Diabetologia 66, 495–507 (2023).
Mahajan, A. et al. Refining the accuracy of validated target identification through coding variant fine-mapping in type 2 diabetes. Nat. Genet. 50, 559–571 (2018).
Suzuki, K. et al. Genetic drivers of heterogeneity in type 2 diabetes pathophysiology. Nature https://doi.org/10.1038/s41586-024-07019-6 (2024).
Martin, A. R. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 51, 584–591 (2019).
Mahajan, A. et al. Multi-ancestry genetic study of type 2 diabetes highlights the power of diverse populations for discovery and translation. Nat. Genet. 54, 560–572 (2022).
Vujkovic, M. et al. Discovery of 318 new risk loci for type 2 diabetes and related vascular outcomes among 1.4 million participants in a multi-ancestry meta-analysis. Nat. Genet. 52, 680–691 (2020).
Spracklen, C. N. et al. Identification of type 2 diabetes loci in 433,540 East Asian individuals. Nature 582, 240–245 (2020).
BasuRay, S., Wang, Y., Smagris, E., Cohen, J. C. & Hobbs, H. H. Accumulation of PNPLA3 on lipid droplets is the basis of associated hepatic steatosis. Proc. Natl Acad. Sci. USA 116, 9521–9526 (2019).
Lee, S. M., Muratalla, J., Sierra-Cruz, M. & Cordoba-Chacon, J. Role of hepatic peroxisome proliferator-activated receptor γ in non-alcoholic fatty liver disease. J. Endocrinol. 257, e220155 (2023).
Getz, G. S. & Reardon, C. A. Apoprotein E and reverse cholesterol transport. Int. J. Mol. Sci. 19, 3479 (2018).
Zhang, K. et al. A single-cell atlas of chromatin accessibility in the human genome. Cell 184, 5985–6001 (2021).
Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
Caleyachetty, R. et al. Ethnicity-specific BMI cutoffs for obesity based on type 2 diabetes risk in England: a population-based cohort study. Lancet Diabetes Endocrinol. 9, 419–426 (2021).
Yaghootkar, H., Whitcher, B., Bell, J. D. & Thomas, E. L. Ethnic differences in adiposity and diabetes risk—insights from genetic studies. J. Intern. Med. 288, 271–283 (2020).
Ntuk, U. E., Gill, J. M. R., Mackay, D. F., Sattar, N. & Pell, J. P. Ethnic-specific obesity cutoffs for diabetes risk: cross-sectional study of 490,288 UK Biobank participants. Diabetes Care 37, 2500–2507 (2014).
Hsu, W. C., Araneta, M. R. G., Kanaya, A. M., Chiang, J. L. & Fujimoto, W. BMI cut points to identify at-risk Asian Americans for type 2 diabetes screening. Diabetes Care 38, 150–158 (2015).
Rodriguez, L. A. et al. Examining if the relationship between BMI and incident type 2 diabetes among middle-older aged adults varies by race/ethnicity: evidence from the Multi-Ethnic Study of Atherosclerosis (MESA). Diabet. Med. 38, e14377 (2021).
Aggarwal, R. et al. Diabetes screening by race and ethnicity in the United States: equivalent body mass index and age thresholds. Ann. Intern. Med. 175, 765–773 (2022).
WHO Expert Consultation. Appropriate body-mass index for Asian populations and its implications for policy and intervention strategies. Lancet 363, 157–163 (2004).
Inker, L. A. et al. New creatinine- and cystatin C-based equations to estimate GFR without race. N. Engl. J. Med. 385, 1737–1749 (2021).
Zhao, W. et al. Identification of new susceptibility loci for type 2 diabetes and shared etiological pathways with coronary heart disease. Nat. Genet. 49, 1450–1457 (2017).
Goodarzi, M. O. & Rotter, J. I. Genetics insights in the relationship between type 2 diabetes and coronary heart disease. Circ. Res. 126, 1526–1548 (2020).
Sattar, N. et al. Statins and risk of incident diabetes: a collaborative meta-analysis of randomised statin trials. Lancet 375, 735–742 (2010).
González-Lleó, A. M., Sánchez-Hernández, R. M., Boronat, M. & Wägner, A. M. Diabetes and familial hypercholesterolemia: interplay between lipid and glucose metabolism. Nutrients 14, 1503 (2022).
Wei, Y. et al. Associations between serum total bilirubin, obesity and type 2 diabetes. Diabetol. Metab. Syndr. 13, 143 (2021).
Hansen, M. et al. Bile acid sequestrants for glycemic control in patients with type 2 diabetes: a systematic review with meta-analysis of randomized controlled trials. J. Diabetes Complications 31, 918–927 (2017).
Glunk, V. et al. A non-coding variant linked to metabolic obesity with normal weight affects actin remodelling in subcutaneous adipocytes. Nat. Metab. 5, 861–879 (2023).
Fathzadeh, M. et al. FAM13A affects body fat distribution and adipocyte function. Nat. Commun. 11, 1465 (2020).
Li, B.-T. et al. Disruption of the ERLIN–TM6SF2–APOB complex destabilizes APOB and contributes to non-alcoholic fatty liver disease. PLoS Genet. 16, e1008955 (2020).
ElSayed, N. A. et al. 2. Classification and diagnosis of diabetes: Standards of Care in Diabetes—2023. Diabetes Care 46, S19–S40 (2023).
Vyas, D. A., Eisenstein, L. G. & Jones, D. S. Hidden in plain sight—reconsidering the use of race correction in clinical algorithms. N. Engl. J. Med. 383, 874–882 (2020).
Narayan, K. M. V. et al. Incidence and pathophysiology of diabetes in South Asian adults living in India and Pakistan compared with US blacks and whites. BMJ Open Diabetes Res. Care 9, e001927 (2021).
Narayan, K. M. V. & Kanaya, A. M. Why are South Asians prone to type 2 diabetes? A hypothesis based on underexplored pathways. Diabetologia 63, 1103–1109 (2020).
All of Us Research Program Investigators et al. The ‘All of Us’ Research Program. N. Engl. J. Med. 381, 668–676 (2019).
Castro, V. M. et al. The Mass General Brigham Biobank Portal: an i2b2-based data repository linking disparate and high-dimensional patient data to support multimodal analytics. J. Am. Med. Inform. Assoc. 29, 643–651 (2022).
Kho, A. N. et al. Use of diverse electronic medical record systems to identify genetic risk for type 2 diabetes within a genome-wide association study. J. Am. Med. Inform. Assoc. 19, 212–218 (2012).
Szczerbinski, L. et al. Algorithms for the identification of prevalent diabetes in the All of Us Research Program validated using polygenic scores—a new resource for diabetes precision medicine. Preprint at bioRxiv https://doi.org/10.1101/2023.09.05.23295061 (2023).
Wakefield, J. A Bayesian measure of the probability of false discovery in genetic epidemiology studies. Am. J. Hum. Genet. 81, 208–227 (2007).
Crawford, S. L. Correlation and regression. Circulation 114, 2083–2088 (2006).
DiCorpo, D. et al. Type 2 diabetes partitioned polygenic scores associate with disease outcomes in 454,193 individuals across 13 cohorts. Diabetes Care 45, 674–683 (2022).
Patel, A. P. et al. Association of rare pathogenic DNA variants for familial hypercholesterolemia, hereditary breast and ovarian cancer syndrome, and Lynch syndrome with disease risk in adults according to family history. JAMA Netw. Open 3, e203959 (2020).
Magudia, K. et al. Population-scale CT-based body composition analysis of a large outpatient population using deep learning to derive age-, sex-, and race-specific reference curves. Radiology 298, 319–329 (2021).
Bridge, C. P. et al. A fully automated deep learning pipeline for multi-vertebral level quantification and characterization of muscle and adipose tissue on chest CT scans. Radiol. Artif. Intell. 4, e210080 (2022).
Acknowledgements
A.J.D. is supported by National Institutes of Health (NIH)/National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) T32 DK007028 and NIH/NIDDK F32 DK137487. K.E.W. is supported by NIH K01DK133637. L.S. is supported by funds from the Ministry of Education and Science of Poland within the ‘Excellence Initiative—Research University’ project, by the Ministry of Health of Poland within the ‘Center of Artificial Intelligence in Medicine at the Medical University of Bialystok’ project and by American Diabetes Association grant 11-22-PDFPM-03. M.C. is supported by the Novo Nordisk Foundation (NNF21SA0072102) and NIDDK UM1 DK126185. J.M.M. is supported by American Diabetes Association Innovative and Clinical Translational Award 1-19-ICTS-068, American Diabetes Association grant 11-22-ICTSPM-16 and National Human Genome Research Institute U01HG011723. M.S.U. is supported by NIDDK K23DK114551, NIDDK R03DK131249 and Doris Duke Foundation Award 2022063. The authors thank the participants of the All of Us Research Program and MGB Biobank. In addition, we thank the Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC) for providing pre-publication access to GWAS summary statistics for post-challenge insulin resistance measures. Finally, we also thank J. Dupuis for assistance with statistical analysis.
Author information
Contributions
Conceived and designed the study: K.S., A.J.D. and M.S.U. Conducted analysis: K.S., A.J.D., C.M. and A.H.-C. Curated data: K.S., A.J.D., H.K., S.H., R.M., P.H.S., K.E.W., L.S., T.D.M., V.K., A.W., A.K.M. and J.M.M. Provided feedback on analysis: N.Z., M.C., J.C.F., A.K.M., J.M.M., K.J.G. and M.S.U. Wrote the initial manuscript draft: K.S., A.J.D. and M.S.U. All authors approved the final version of the manuscript.
Ethics declarations
Competing interests
T.D.M. currently works for Vertex Pharmaceuticals. K.S., M.C., J.C.F., J.M.M. and M.S.U. are currently part of a collaboration project between the Broad Institute and Novo Nordisk. M.S.U. is an unfunded collaborator with Nightingale and AstraZeneca. None of the other authors declare competing interests.
Peer review
Peer review information
Nature Medicine thanks Constantin Polychronakos and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Anna Maria Ranzoni, in collaboration with the Nature Medicine team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Overview of high-throughput bNMF pipeline for multi-ancestry (MA) clusters.
Flowchart of the steps implemented in the clustering pipeline. Steps include: 1) extract variants from a diverse set of T2D GWAS datasets, 2) apply LD-pruning across reference panels for all populations included, to ensure independent genetic signals, 3) find proxy variants for variants that are multi-allelic, ambiguous, or have low trait counts, 4) align variants to risk-increasing alleles in the largest MA T2D GWAS and remove them if their P value in this GWAS does not meet a Bonferroni threshold, 5) filter trait GWAS by minimum sample size, 6) filter trait GWAS by a minimum Bonferroni-corrected P value across the selected variants, 7) filter by correlation between traits and 8) generate the variant-by-trait association matrix, which serves as the bNMF input.
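Steps 4 and 8 of the legend above can be illustrated with a minimal Python sketch. It is not the authors' pipeline (their code is in the repository listed under Code availability). The `Variant` class, the function names, the significance level of 0.05 and the toy values are all hypothetical. Splitting each signed z-score into a positive and a negative column is one common way to obtain a non-negative input for matrix factorization and is assumed here rather than taken from the legend.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    rsid: str
    t2d_beta: float  # effect of the reported allele on T2D (log odds)
    t2d_p: float     # P value in the reference T2D GWAS
    trait_z: dict    # trait name -> z-score for the same reported allele


def align_to_risk_allele(v):
    """Flip signs so that all effects refer to the T2D risk-increasing allele."""
    if v.t2d_beta >= 0:
        return v
    flipped = {trait: -z for trait, z in v.trait_z.items()}
    return Variant(v.rsid, -v.t2d_beta, v.t2d_p, flipped)


def bonferroni_filter(variants, alpha=0.05):
    """Keep variants whose T2D P value passes alpha / (number of variants)."""
    threshold = alpha / len(variants)
    return [v for v in variants if v.t2d_p < threshold]


def nonnegative_matrix(variants, traits):
    """Build a variant-by-trait matrix with two non-negative columns per trait.

    Assumption: each z-score is split into (positive part, negative part).
    """
    rows = []
    for v in variants:
        row = []
        for trait in traits:
            z = v.trait_z.get(trait, 0.0)
            row.extend([max(z, 0.0), max(-z, 0.0)])
        rows.append(row)
    return rows


# Toy input: made-up numbers for illustration only.
variants = [
    Variant("rsA", 0.10, 1e-12, {"BMI": 2.0, "HDL": -3.0}),
    Variant("rsB", -0.05, 1e-9, {"BMI": 1.5, "HDL": 0.5}),
    Variant("rsC", 0.02, 0.2, {"BMI": 0.1, "HDL": 0.0}),
]
aligned = [align_to_risk_allele(v) for v in variants]
kept = bonferroni_filter(aligned)
matrix = nonnegative_matrix(kept, ["BMI", "HDL"])
```

In this toy input, `rsC` fails the threshold, and the effects of `rsB` are flipped because its reported allele lowers T2D risk.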
Extended Data Fig. 3 Common T2D genetic clusters are shared across individual ancestry groups.
Heatmaps display the correlation of the trait cluster weights in the European, East Asian, African and Admixed American clusters versus the (a) multi-ancestry and (b) Udler et al. clusters (ref. 3).
Extended Data Fig. 4 Sex-stratified association of multi-ancestry T2D genetic clusters with anthropometric traits.
Each dot displays the sex-stratified association between selected multi-ancestry T2D cluster pPS and selected anthropometric traits. (a) Visceral adipose tissue [VAT], subcutaneous adipose tissue [SAT], and VAT/SAT ratio, measured in a subset of approximately 9,000 MGB Biobank participants with available data. (b) Waist circumference, hip circumference, and waist/hip ratio, measured in the All of Us Cohort. Each outcome was normalized to a standard normal distribution (for all participants, females only, or males only). Each dot indicates the effect per one standard deviation increase in the pPS. Error bars denote the standard error. P values were obtained from two-sided t tests, and asterisks indicate P < 0.05. Complete statistics (including exact P values and the number of individuals measured for each outcome) are provided in Supplementary Table 13.
Extended Data Fig. 5 Variation in distribution of multi-ancestry T2D genetic clusters across ancestry groups.
Each histogram displays the distribution of the pPS for the indicated multi-ancestry T2D genetic cluster. For each cluster, the pPS for the entire cohort was normalized to a standard normal distribution, and a separate curve is displayed for each genetically inferred ancestry group. All analyses were performed in a meta-analysis of All of Us and MGB Biobank. AFR, African; AMR, Admixed American; EAS, East Asian; EUR, European; SAS, South Asian.
Extended Data Fig. 6 Proportion of total T2D genetic risk attributable to each multi-ancestry T2D cluster.
For each individual, the total T2D genetic risk was calculated as the sum of the pPS across each of the 12 multi-ancestry T2D genetic clusters. All individuals were then grouped according to genetically inferred ancestry. Each bar displays the proportion of the total T2D genetic risk conferred by each specific cluster. This graph represents a meta-analysis of All of Us and MGB Biobank. AFR, African; AMR, Admixed American; EAS, East Asian; EUR, European; SAS, South Asian.
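The proportion described in this legend can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the toy scores are hypothetical, the pPS values are taken to be non-negative, and the sketch aggregates by summing scores within each ancestry group, which may differ from how the published figure was computed.

```python
from collections import defaultdict


def cluster_risk_proportions(individuals):
    """Return {ancestry: {cluster: share of summed pPS}}.

    `individuals` is an iterable of (ancestry, {cluster: pPS}) pairs.
    Assumption: pPS values are non-negative, so shares lie in [0, 1].
    """
    totals = defaultdict(lambda: defaultdict(float))
    for ancestry, pps in individuals:
        for cluster, score in pps.items():
            totals[ancestry][cluster] += score

    proportions = {}
    for ancestry, by_cluster in totals.items():
        grand_total = sum(by_cluster.values())
        proportions[ancestry] = {
            cluster: score / grand_total for cluster, score in by_cluster.items()
        }
    return proportions


# Toy input: made-up scores for two of the clusters named in the legends.
toy = [
    ("EUR", {"Lipodystrophy 1": 1.0, "Lipodystrophy 2": 3.0}),
    ("EUR", {"Lipodystrophy 1": 1.0, "Lipodystrophy 2": 3.0}),
    ("EAS", {"Lipodystrophy 1": 2.0, "Lipodystrophy 2": 2.0}),
]
shares = cluster_risk_proportions(toy)
```

Within each ancestry group the shares sum to 1, which is what lets the figure show them as the segments of one bar per group.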
Extended Data Fig. 7 Validation of relationship between T2D genetic clusters, BMI, and T2D risk.
(a, b) Relationship between BMI and T2D risk, classified by genetic ancestry. T2D risk was controlled for age, sex, BMI, and genetic ancestry group. (a) T2D risk in all participants. (b) T2D risk in a subset of participants with similar Lipodystrophy 1 pPS (ranging from 0.5 to 1.5, representing 22.8% of the overall population). Analyses represent a meta-analysis of MGB Biobank and All of Us. (c) Relationship between BMI and T2D risk, assessed within EUR ancestry participants in All of Us. Covariates include age, sex, BMI, and 10 principal components. T2D risk was calculated separately for individuals in the top and bottom deciles of the Lipodystrophy 1 pPS. In all panels, the horizontal dashed line is chosen at a representative level to indicate the outcome at a BMI of 30 kg/m² in the EUR ancestry group.
Extended Data Fig. 8 Comparison of cluster-specific risk allele frequencies (RAF) in EUR and other ancestry groups.
We compared the T2D risk allele frequencies between the EUR ancestry group and each of the other groups (AFR, AMR, EAS, and SAS). For each multi-ancestry cluster, we plotted the allele frequency for cluster-defining variants (those above the bNMF weight cutoff). In both the Lipodystrophy 1 and Lipodystrophy 2 clusters, the majority of variants had higher risk allele frequency in EAS compared to EUR ancestry, consistent with our observation that the pPS distributions for these two clusters were shifted to the right in EAS ancestry.
Extended Data Fig. 9 Ancestry-specific variation in adipose volume and triglycerides.
(a, b) Relationship between VAT/SAT ratio and T2D risk, classified by genetic ancestry. (a) Unadjusted T2D risk, controlling for age, sex, BMI, and genetic ancestry group. (b) Adjusted T2D risk, after additionally controlling for Lipodystrophy 1 pPS and Lipodystrophy 2 pPS. Analyses represent a subset of ~9,000 MGB Biobank participants with available VAT/SAT data. (c, d) Relationship between BMI and triglyceride levels, classified by genetic ancestry. (c) Unadjusted triglyceride level, controlling for age, sex, BMI, and genetic ancestry group. (d) Adjusted triglyceride level, after additionally controlling for Lipodystrophy 1 pPS and Lipodystrophy 2 pPS. Analyses represent a meta-analysis of MGB Biobank and All of Us. (e) Relationship between BMI and triglyceride levels, assessed within EUR ancestry participants in All of Us. Covariates include age, sex, BMI, and 10 principal components. Triglyceride levels were calculated separately for individuals in the top and bottom deciles of the Lipodystrophy 1 pPS. In all panels, the horizontal dashed line is chosen at a representative level to indicate the outcome at a BMI of 30 kg/m² in the EUR ancestry group.
Extended Data Fig. 10 Conservation of biological pathways between the multi-ancestry and T2DGGI clusters.
Variants in the T2DGGI clusters (N_SNP = 1,289) were assigned a proxy variant from the multi-ancestry variant set (N_SNP = 650) based on linkage disequilibrium values (r² > 0.5). The clusters were then cross-examined using a two-sided Wilcoxon rank-sum test based on the multi-ancestry bNMF variant weights.
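For readers unfamiliar with the test named in this legend, the following is a minimal Python sketch of a two-sided Wilcoxon rank-sum test. It makes simplifying assumptions that may not match the authors' analysis: it uses the normal approximation without continuity or tie correction, and the function name and the toy weights are hypothetical. Here `x` stands for the bNMF weights of the proxy variants belonging to one T2DGGI cluster and `y` for the weights of the remaining variants.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (W, z, p), where W is the sum of the ranks of `x` in the pooled
    sample. Ties receive average ranks; no tie correction is applied.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(x), len(y)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return w, z, p


# Toy weights: made-up values for illustration only.
in_cluster = [0.9, 0.8, 0.7]
other_variants = [0.1, 0.2, 0.3, 0.4]
w, z, p = rank_sum_test(in_cluster, other_variants)
```

With samples this small an exact test would be preferable; the approximation is used only to keep the sketch short.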
Supplementary information
Supplementary Tables
Supplementary data workbook
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Smith, K., Deutsch, A.J., McGrail, C. et al. Multi-ancestry polygenic mechanisms of type 2 diabetes. Nat Med 30, 1065–1074 (2024). https://doi.org/10.1038/s41591-024-02865-3
DOI: https://doi.org/10.1038/s41591-024-02865-3