The genetic architecture of type 2 diabetes

Journal name: Nature
Volume: 536
Pages: 41–47
DOI: doi:10.1038/nature18642

Abstract

The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has been long debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of the heritability of this disease. Here, to test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole-genome sequencing in 2,657 European individuals with and without diabetes, and exome sequencing in 12,940 individuals from five ancestry groups. To increase statistical power, we expanded the sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes.
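The gain in statistical power from the expanded sample can be illustrated with a minimal sketch. It assumes a two-sided, one-degree-of-freedom allelic test, a normal approximation for the estimated log odds ratio, and a balanced-equivalent design summarized by an effective sample size; the function name `single_variant_power` and the variance approximation are illustrative assumptions, not the consortia's actual power calculation. The effective sample sizes used in the example (2,657 and 31,007) are those quoted in the caption to Extended Data Fig. 2.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def single_variant_power(maf, odds_ratio, n_eff, alpha=5e-8):
    """Approximate power of a two-sided single-variant test.

    Assumption (not from the paper): with n_eff / 2 cases and n_eff / 2
    controls, Var(log OR estimate) is roughly 2 / (n_eff * maf * (1 - maf)).
    """
    beta = math.log(odds_ratio)
    se = math.sqrt(2.0 / (n_eff * maf * (1.0 - maf)))
    z = abs(beta) / se
    # Critical value for a two-sided test at significance level alpha.
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    # Probability that the test statistic falls in either rejection tail.
    return _STD_NORMAL.cdf(z - z_crit) + _STD_NORMAL.cdf(-z - z_crit)


if __name__ == "__main__":
    # Hypothetical variant: MAF 5%, allelic OR 1.5.
    for n_eff in (2657, 31007):
        power = single_variant_power(maf=0.05, odds_ratio=1.5, n_eff=n_eff)
        print(f"Neff = {n_eff}: approximate power = {power:.3f}")
```

Under these assumptions, a variant that is essentially undetectable in the sequenced samples alone becomes well powered once the imputed samples are added, which is the motivation for the expansion described above.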

At a glance

Figures

  1. Figure 1: Ascertainment of variants and single-variant results.

    a, Sensitivity of low-coverage genome sequence data to detect SNVs in the deep exome sequence data, relative to other variant catalogues. Points represent results for a specific minor allele count. All results assume odds ratio (OR) = 1 for all variants, unless stated otherwise. b, c, Manhattan plots of single-variant association analyses for: sequence data alone (b, 1,326 cases and 1,331 controls) and meta-analysis of sequence and imputed data (c, total of 14,297 cases and 32,774 controls). 1000G, the 1000 Genomes Project data.

  2. Figure 2: Association between T2D and variants in genes for Mendelian forms of diabetes.

    a, P values (plotted as −log₁₀ P) of aggregate association for variants from 6,504 T2D cases and 6,436 controls in three sets of Mendelian diabetes genes, for five variant ‘masks’ (see Methods). Dotted line, P = 0.05. b, Estimated T2D odds ratio (OR) for carriers of variants in each gene-set and mask. c, Estimated ORs (bars, left axis) and P values (dots, right axis) for carriers of variants in the PTV + NSstrict mask for each gene. Red, OR > 1; blue, OR < 1; dotted line, P = 0.05. Error bars represent s.e.

  3. Figure 3: Empirical T2D association results compared to results under different simulated disease models.

    Observed number of rare and low-frequency (MAF < 5%) genetic association signals for T2D detected genome-wide after imputation, compared to the numbers seen under three simulated disease models for T2D that were plausible given results (T2D recurrence risks, GWAS, linkage) available before large-scale sequencing. Simulated models were defined by two parameters: disease target size T and degree of coupling τ between the causal effects of variants and the selective pressure against them (ref. 40). Simulated data were generated to match GoT2D imputation quality as a function of MAF (see Methods). Error bars represent s.e. observed across simulation replicates (bar value shows the mean).

  4. Extended Data Fig. 1: Summary of samples and quality control procedures.

    This figure summarizes data generation for whole-genome sequencing (GoT2D), exome sequencing (GoT2D and T2D-GENES), exome array genotyping (DIAGRAM), and GWAS imputation (DIAGRAM).

  5. Extended Data Fig. 2: Power for single and aggregate variant association.

    a–g, Power to detect single-variant association (α = 5 × 10⁻⁸) at varying minor allele frequencies (x-axis) and allelic ORs (y-axis) for seven effective sample size (Neff) scenarios relevant to the genomes (a–c) and exomes (d–g) components of this project. a, Variant observed in 2,657 samples (the effective size of the GoT2D integrated panel). b, Variant observed in 28,350 samples (the effective size of the imputed data set). c, Variant observed in the GoT2D integrated panel and the imputed data set (effective sample size 31,007). d, Ancestry-specific variant in 2,000 samples (the size of each of the non-European exome sequence data sets). e, European-specific variant in 5,000 samples (the combined size of the European exome sequence data sets). f, Variant observed with shared frequency across all ancestry groups in 12,940 samples (the size of the combined exome sequence data set). g, Variant observed in the combined exome array and sequencing data set (effective sample size 82,758). h, i, Power for gene-based test of association (SKAT-O) according to liability variance explained. In h, 50% of the variants contribute to disease risk and the remaining 50% have no effect on disease risk; in i, 100% of the variants contribute to disease risk. For each, sample sizes considered are 2,000 (ancestry-specific effects; green) and 12,940 (ancestry-shared effects; blue). Power is shown for two levels of significance (α = 2.5 × 10⁻⁶ and α = 0.001). From these simulation studies, it is clear that under the optimistic model, where effects are shared across all ethnicities (blue line) and all variants contribute, power is >60% for 1% variance explained and α = 2.5 × 10⁻⁶. However, power declines rapidly if either criterion is relaxed.

  6. Extended Data Fig. 3: Single variant analyses.

    a–c, Manhattan plots of single-variant analyses generated from exome sequence data in 6,504 cases and 6,436 controls of African American, East Asian, European, Hispanic, and South Asian ancestry (a); exome array genotypes in 28,305 cases and 51,549 controls of European ancestry (b); and combined meta-analysis of exome array and exome sequence samples (c). Coding variants are categorized according to their relationships to the previously reported lead variant from GWAS region. Loci achieving genome-wide significance only in the combined analysis are highlighted in bold. The HNF1A variant reaching genome-wide significance in the combined analysis is a synonymous variant (Thr515Thr). The dashed horizontal line in each panel designates the threshold for genome-wide significance (P < 5 × 10⁻⁸).

  7. Extended Data Fig. 4: Classification of coding variants according to their relationship to reported lead variants for each GWAS region.

    The ideogram shows the location of 25 coding variant associations at 16 loci described in the text. The number in each circle corresponds to the number of associated variants at each locus. Variants are grouped into five categories based on inferred relationship with the GWAS lead variant. For some of these categories, the figure includes representative regional association plots based on exome array meta-analysis data from 28,305 cases and 51,549 controls. The locus displayed for each category is designated in bold. The first plot in each panel shows the unconditional association results; the middle plot the association results after conditioning on the non-coding GWAS SNP; and the last plot the results after conditioning on the most significantly associated coding variant. Each point represents an SNP in the exome array meta-analysis, plotted with its P value (on a −log₁₀ scale) as a function of the genomic position (hg19). In each panel, the lead coding variant is represented by the purple symbol. The colour-coding of all other SNPs indicates LD with the lead SNP (estimated by European r² from 1000G March 2012 reference panel: red r² ≥ 0.8; gold 0.6 ≤ r² < 0.8; green 0.4 ≤ r² < 0.6; cyan 0.2 ≤ r² < 0.4; blue r² < 0.2; grey r² unknown). Gene annotations are taken from the University of California Santa Cruz genome browser. GWS: genome-wide significance. *Seven variants, three at ASCC2, and one each at THADA, TSPAN8, FES and HNF4A did not achieve genome-wide significance themselves, but are included because they fall into genes and/or regions with other significant association signals (see text).

  8. Extended Data Fig. 5: Exclusion of synthetic associations and construction of credible causal variant sets at T2D GWAS loci.

    Ten T2D GWAS loci were selected for synthetic association testing (P < 0.001; see Methods). a, The effect size observed at the GWAS index SNV (sequence data) before (navy blue) and after (light blue, grey) conditioning on candidate rare and low-frequency (MAF <5%) variants which could produce synthetic association. b, Example of synthetic association exclusion at the TCF7L2 locus. Error bars represent 95% confidence intervals for the index SNP odds ratio as rare variants are greedily added to the model. c, The size of credible sets at T2D GWAS loci when constructed from the GoT2D data, compared to the sizes when restricted to variants in the 1000G or HapMap data.

  9. Extended Data Fig. 6: Genome enrichment analysis in GoT2D whole genome sequence data.

    n = 2,657. a, Functional annotation categories were defined using transcription, chromatin state and transcription factor binding data from GENCODE, ENCODE and other studies. b, T2D association statistics for variants at each T2D locus were jointly modelled with functional annotation using fgwas. In the resulting model we identified enrichment of coding exons (CDS), transcription factor binding sites (TFBS), mature adipose active enhancers and promoters (hASC-t4 EnhA, TssA), pancreatic islet active and weak enhancers (HI EnhA, EnhWk), pre-adipose active and weak enhancers (hASC-t1 EnhA, EnhWk), embryonic stem cell active promoters (H1-hESC TssA) and 5′UTRs. Dots represent enrichment estimates and horizontal lines the 95% confidence intervals. c, At the CCND2 locus, three variants not present in HapMap2 have a combined 90% posterior probability of being causal (rs4238013, rs3217801, rs73040004). One of these variants, rs3217801, is a 2-bp indel that overlaps an islet enhancer element.

  10. Extended Data Fig. 7: Low frequency variants in exome array data.

    Results from meta-analysis of 43,045 low-frequency and common coding variants on the exome array (assayed in 79,854 European subjects). a, Observed allelic ORs as a function of MAF. Variants missing in more than eight cohorts or polymorphic in only one cohort were excluded. Coloured lines represent contours for liability variance explained. Regions shaded grey denote ranges of OR and MAF consistent with 80% power (in this case, at α = 5 × 10⁻⁷) to detect single-variant associations in this data set (given the observed range of missing data). Variants with a black collar are those highlighted by a bounding analysis as having a probability >0.8 of having liability-scale variance explained (LVE) > 0.1%. b, Distribution of each variant in the MAF/OR space was computed by assuming T2D prevalence of 8% and a beta and normal distribution for MAF and OR, respectively. Probability is obtained by integrating the joint MAF–OR distributions over ranges of LVE. c, Single variant association, liability and bounding results for the known T2D GWAS variants on the exome array (see Methods).
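
The liability-scale variance explained (LVE) referred to in the caption to Extended Data Fig. 7 can be approximated from a variant's MAF, its allelic OR and the disease prevalence. The sketch below is one common approximation, assuming a liability-threshold model with unit residual variance, Hardy-Weinberg genotype proportions and a log-additive odds model; the function names are hypothetical and this is not necessarily the bounding analysis described in the Methods. The default prevalence of 8% is the value stated in the caption.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def genotype_risks(maf, odds_ratio, prevalence):
    """Return (genotype frequencies, disease risks) for 0, 1, 2 risk alleles.

    The baseline odds are found by bisection so that the frequency-weighted
    average risk equals the population prevalence.
    """
    freqs = [(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2]  # Hardy-Weinberg

    def risks_for(base_odds):
        odds = [base_odds * odds_ratio ** k for k in range(3)]
        return [o / (1 + o) for o in odds]

    lo, hi = 1e-9, 1e9
    for _ in range(200):
        mid = (lo * hi) ** 0.5  # bisection on the log scale
        average_risk = sum(f * r for f, r in zip(freqs, risks_for(mid)))
        if average_risk < prevalence:
            lo = mid
        else:
            hi = mid
    return freqs, risks_for((lo * hi) ** 0.5)


def liability_variance_explained(maf, odds_ratio, prevalence=0.08):
    """Approximate fraction of liability variance explained by one variant."""
    freqs, risks = genotype_risks(maf, odds_ratio, prevalence)
    threshold = STD_NORMAL.inv_cdf(1 - prevalence)
    # Mean liability per genotype, chosen so that the area above the
    # threshold matches that genotype's disease risk.
    means = [threshold - STD_NORMAL.inv_cdf(1 - r) for r in risks]
    overall = sum(f * m for f, m in zip(freqs, means))
    var_g = sum(f * (m - overall) ** 2 for f, m in zip(freqs, means))
    return var_g / (1 + var_g)


if __name__ == "__main__":
    # Hypothetical variants, for illustration only.
    for maf, odds_ratio in [(0.30, 1.4), (0.01, 1.4), (0.01, 3.0)]:
        lve = liability_variance_explained(maf, odds_ratio)
        print(f"MAF = {maf}, OR = {odds_ratio}: LVE = {100 * lve:.3f}%")
```

The example shows why contours of constant LVE curve through the MAF/OR plane: at a fixed OR, a rarer variant explains less variance, so a low-frequency variant needs a much larger OR to reach the same LVE as a common one.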

Tables

  1. Extended Data Table 1: Summary information for sample sets used in the association analyses
  2. Extended Data Table 2: Counts and properties of variants identified in sequenced subjects
  3. Extended Data Table 3: Characterization of variant associations through conditional analysis
  4. Extended Data Table 4: Testing for synthetic associations across GWAS-identified T2D loci

References

  1. Willemsen, G. et al. The concordance and heritability of type 2 diabetes in 34,166 twin pairs from international twin registers: the discordant twin (DISCOTWIN) consortium. Twin Res. Hum. Genet. 18, 762–771 (2015)
  2. Morris, A. P. et al. Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nat. Genet. 44, 981–990 (2012)
  3. Mahajan, A. et al. Genome-wide trans-ancestry meta-analysis provides insight into the genetic architecture of type 2 diabetes susceptibility. Nat. Genet. 46, 234–244 (2014)
  4. Voight, B. F. et al. Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis. Nat. Genet. 42, 579–589 (2010)
  5. Kooner, J. S. et al. Genome-wide association study in individuals of South Asian ancestry identifies six new type 2 diabetes susceptibility loci. Nat. Genet. 43, 984–989 (2011)
  6. Cho, Y. S. et al. Meta-analysis of genome-wide association studies identifies eight new loci for type 2 diabetes in east Asians. Nat. Genet. 44, 67–72 (2011)
  7. Steinthorsdottir, V. et al. Identification of low-frequency and rare sequence variants associated with elevated or reduced risk of type 2 diabetes. Nat. Genet. 46, 294–298 (2014)
  8. Ma, R. C. et al. Genome-wide association study in a Chinese population identifies a susceptibility locus for type 2 diabetes at 7q32 near PAX4. Diabetologia 56, 1291–1305 (2013)
  9. Huyghe, J. R. et al. Exome array analysis identifies new loci and low-frequency variants influencing insulin processing and secretion. Nat. Genet. 45, 197–201 (2013)
  10. Gaulton, K. J. et al. Genetic fine mapping and genomic annotation defines causal mechanisms at type 2 diabetes susceptibility loci. Nat. Genet. 47, 1415–1425 (2015)
  11. Manolio, T. A. et al. Finding the missing heritability of complex diseases. Nature 461, 747–753 (2009)
  12. Lohmueller, K. E. et al. Whole-exome sequencing of 2,000 Danish individuals and the role of rare coding variants in type 2 diabetes. Am. J. Hum. Genet. 93, 1072–1086 (2013)
  13. Albrechtsen, A. et al. Exome sequencing-driven discovery of coding polymorphisms associated with common metabolic phenotypes. Diabetologia 56, 298–310 (2013)
  14. Claussnitzer, M. et al. Leveraging cross-species transcription factor binding site patterns: from diabetes risk loci to disease mechanisms. Cell 156, 343–358 (2014)
  15. Lee, S., Teslovich, T. M., Boehnke, M. & Lin, X. General framework for meta-analysis of rare variants in sequencing association studies. Am. J. Hum. Genet. 93, 42–53 (2013)
  16. Kang, H. M. et al. Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42, 348–354 (2010)
  17. Collombat, P. et al. Opposing actions of Arx and Pax4 in endocrine pancreas development. Genes Dev. 17, 2591–2603 (2003)
  18. Kooptiwut, S. et al. Defective PAX4 R192H transcriptional repressor activities associated with maturity onset diabetes of the young and early onset-age of type 2 diabetes. J. Diabetes Complications 26, 343–347 (2012)
  19. Langenberg, C. et al. Design and cohort description of the InterAct Project: an examination of the interaction of genetic and lifestyle factors on the incidence of type 2 diabetes in the EPIC Study. Diabetologia 54, 2272–2282 (2011)
  20. Oppelt, A. et al. Production of phosphatidylinositol 5-phosphate via PIKfyve and MTMR3 regulates cell migration. EMBO Rep. 14, 57–64 (2013)
  21. Kozlitina, J. et al. Exome-wide association study identifies a TM6SF2 variant that confers susceptibility to nonalcoholic fatty liver disease. Nat. Genet. 46, 352–356 (2014)
  22. Mahdessian, H. et al. TM6SF2 is a regulator of liver fat metabolism influencing triglyceride secretion and hepatic lipid droplet content. Proc. Natl Acad. Sci. USA 111, 8913–8918 (2014)
  23. Thiagalingam, A., Lengauer, C., Baylin, S. B. & Nelkin, B. D. RREB1, a ras responsive element binding protein, maps to human chromosome 6p25. Genomics 45, 630–632 (1997)
  24. Murphy, R., Ellard, S. & Hattersley, A. T. Clinical implications of a molecular genetic classification of monogenic β-cell diabetes. Nat. Clin. Pract. Endocrinol. Metab. 4, 200–213 (2008)
  25. Dickson, S. P., Wang, K., Krantz, I., Hakonarson, H. & Goldstein, D. B. Rare variants create synthetic genome-wide associations. PLoS Biol. 8, e1000294 (2010)
  26. Anderson, C. A., Soranzo, N., Zeggini, E. & Barrett, J. C. Synthetic associations are unlikely to account for many common disease genome-wide association signals. PLoS Biol. 9, e1000580 (2011)
  27. Wray, N. R., Purcell, S. M. & Visscher, P. M. Synthetic associations created by rare variants do not explain most GWAS results. PLoS Biol. 9, e1000579 (2011)
  28. Sim, X. et al. Transferability of type 2 diabetes implicated loci in multi-ethnic cohorts from Southeast Asia. PLoS Genet. 7, e1001363 (2011)
  29. Goldstein, D. B. The importance of synthetic associations will only be resolved empirically. PLoS Biol. 9, e1001008 (2011)
  30. Wakefield, J. A Bayesian measure of the probability of false discovery in genetic epidemiology studies. Am. J. Hum. Genet. 81, 208–227 (2007)
  31. Maller, J. B. et al. Bayesian refinement of association signals for 14 loci in 3 common diseases. Nat. Genet. 44, 1294–1301 (2012)
  32. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012)
  33. Mikkelsen, T. S. et al. Comparative epigenomic analysis of murine and human adipogenesis. Cell 143, 156–169 (2010)
  34. Parker, S. C. et al. Chromatin stretch enhancer states drive cell-specific gene regulation and harbor human disease risk variants. Proc. Natl Acad. Sci. USA 110, 17921–17926 (2013)
  35. Pasquali, L. et al. Pancreatic islet enhancer clusters enriched in type 2 diabetes risk-associated variants. Nat. Genet. 46, 136–143 (2014)
  36. Gaulton, K. J. et al. A map of open chromatin in human pancreatic islets. Nat. Genet. 42, 255–259 (2010)
  37. Maurano, M. T. et al. Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195 (2012)
  38. Pickrell, J. K. Joint analysis of functional genomic data and genome-wide association studies of 18 human traits. Am. J. Hum. Genet. 94, 559–573 (2014)
  39. Falconer, D. S. The inheritance of liability to certain diseases, estimated from the incidence among relatives. Ann. Hum. Genet. 29, 51–76 (1965)
  40. Agarwala, V., Flannick, J. & Sunyaev, S., GoT2D Consortium & Altshuler, D. Evaluating empirical bounds on complex disease genetic architecture. Nat. Genet. 45, 1418–1427 (2013)
  41. McClellan, J. & King, M. C. Genetic heterogeneity in human disease. Cell 141, 210–217 (2010)
  42. Yang, J. et al. Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 42, 565–569 (2010)
  43. Flannick, J. et al. Loss-of-function mutations in SLC30A8 protect against type 2 diabetes. Nat. Genet. 46, 357–363 (2014)
  44. Bonnefond, A. et al. Rare MTNR1B variants impairing melatonin receptor 1B function contribute to type 2 diabetes. Nat. Genet. 44, 297–301 (2012)
  45. Sigma Type 2 Diabetes Consortium et al. Sequence variants in SLC16A11 are a common risk factor for type 2 diabetes in Mexico. Nature 506, 97–101 (2014)
  46. Moltke, I. et al. A common Greenlandic TBC1D4 variant confers muscle insulin resistance and type 2 diabetes. Nature 512, 190–193 (2014)
  47. Sigma Type 2 Diabetes Consortium et al. Association of a low-frequency variant in HNF1A with type 2 diabetes in a Latino population. JAMA 311, 2305–2314 (2014)
  48. Wang, T., Wei, J. J., Sabatini, D. M. & Lander, E. S. Genetic screens in human cells using the CRISPR-Cas9 system. Science 343, 80–84 (2014)
  49. Majithia, A. R. et al. Rare variants in PPARG with decreased activity in adipocyte differentiation are associated with increased risk of type 2 diabetes. Proc. Natl Acad. Sci. USA 111, 13127–13132 (2014)
  50. Guey, L. T. et al. Power in the phenotypic extremes: a simulation study of power in discovery and replication of rare variants. Genet. Epidemiol. 35, 236–246 (2011)
  51. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009)
  52. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011)
  53. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010)
  54. Jun, G. et al. Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data. Am. J. Hum. Genet. 91, 839–848 (2012)
  55. Abecasis, G. R. et al. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012)
  56. Handsaker, R. E., Korn, J. M., Nemesh, J. & McCarroll, S. A. Discovery and genotyping of genome structural polymorphism by sequencing on a population scale. Nat. Genet. 43, 269–276 (2011)
  57. Browning, S. R. & Browning, B. L. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am. J. Hum. Genet. 81, 1084–1097 (2007)
  58. Li, Y., Sidore, C., Kang, H. M., Boehnke, M. & Abecasis, G. R. Low-coverage sequencing: implications for design of complex trait association studies. Genome Res. 21, 940–951 (2011)
  59. Price, A. L. et al. Long-range LD can confound genome scans in admixed populations. Am. J. Hum. Genet. 83, 132–135, author reply 135–139 (2008)
  60. Weale, M. E. Quality control for genome-wide association studies. Methods Mol. Biol. 628, 341–372 (2010)
  61. Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661–678 (2007)
  62. Price, A. L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38, 904–909 (2006)
  63. Fuchsberger, C., Abecasis, G. R. & Hinds, D. A. minimac2: faster genotype imputation. Bioinformatics 31, 782–784 (2015)
  64. Firth, D. Bias reduction of maximum-likelihood-estimates. Biometrika 80, 27–38 (1993)
  65. Ma, C., Blackwell, T., Boehnke, M. & Scott, L. J. Recommended joint and meta-analysis strategies for case-control association testing of single low-count variants. Genet. Epidemiol. 37, 539–550 (2013)
  66. Morris, A. P. Transethnic meta-analysis of genomewide association studies. Genet. Epidemiol. 35, 809–822 (2011)
  67. Seldin, M. F., Pasaniuc, B. & Price, A. L. New approaches to disease mapping in admixed populations. Nat. Rev. Genet. 12, 523–528 (2011)
  68. Price, A. L. et al. Sensitive detection of chromosomal segments of distinct ancestry in admixed populations. PLoS Genet. 5, e1000519 (2009)
  69. Churchhouse, C. & Marchini, J. Multiway admixture deconvolution using phased or unphased ancestral panels. Genet. Epidemiol. 37, 1–12 (2013)
  70. Purcell, S. M. et al. A polygenic burden of rare disruptive mutations in schizophrenia. Nature 506, 185–190 (2014)
  71. Lee, S., Wu, M. C. & Lin, X. Optimal tests for rare variant effects in sequencing association studies. Biostatistics 13, 762–775 (2012)
  72. Marchini, J., Howie, B., Myers, S., McVean, G. & Donnelly, P. A new multipoint method for genome-wide association studies by imputation of genotypes. Nat. Genet. 39, 906–913 (2007)
  73. Devlin, B. & Roeder, K. Genomic control for association studies. Biometrics 55, 997–1004 (1999)
  74. Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010)
  75. Hindorff, L. A. et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc. Natl Acad. Sci. USA 106, 9362–9367 (2009)
  76. Korn, J. M. et al. Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and rare CNVs. Nat. Genet. 40, 1253–1260 (2008)
  77. Rice, W. R. A consensus combined P-value test and the family-wide significance of component tests. Biometrics 46, 303–308 (1990)
  78. Yang, J. et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 44, 369–375 (2012)
  79. Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011)
  80. Harrow, J. et al. GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res. 22, 1760–1774 (2012)
  81. Ernst, J. & Kellis, M. Discovery and characterization of chromatin states for systematic annotation of the human genome. Nat. Biotechnol. 28, 817–825 (2010)
  82. Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005)
  83. Lage, K. et al. A human phenome–interactome network of protein complexes implicated in genetic disorders. Nat. Biotechnol. 25, 309–316 (2007)
  84. Nepusz, T., Yu, H. & Paccanaro, A. Detecting overlapping protein complexes in protein–protein interaction networks. Nat. Methods 9, 471–472 (2012)
  85. Jia, P., Zheng, S., Long, J., Zheng, W. & Zhao, Z. dmGWAS: dense module searching for genome-wide association studies in protein–protein interaction networks. Bioinformatics 27, 95–102 (2011)
  86. Lambert, B. W., Terwilliger, J. D. & Weiss, K. M. ForSim: a tool for exploring the genetic architecture of complex traits with controlled truth. Bioinformatics 24, 1821–1822 (2008)
  87. Eyre-Walker, A. Evolution in health and medicine Sackler colloquium: Genetic architecture of a complex trait and its implications for fitness and genome-wide association studies. Proc. Natl Acad. Sci. USA 107 (Suppl 1), 1752–1756 (2010)
  88. Lyssenko, V. et al. Clinical risk factors, DNA variants, and the development of type 2 diabetes. N. Engl. J. Med. 359, 2220–2232 (2008)


Author information

  1. These authors contributed equally to this work.

    • Christian Fuchsberger,
    • Jason Flannick,
    • Tanya M. Teslovich,
    • Anubha Mahajan,
    • Vineeta Agarwala &
    • Kyle J. Gaulton
  2. These authors jointly supervised this work.

    • Michael Boehnke,
    • David Altshuler &
    • Mark I. McCarthy
  3. Present addresses are provided in the Supplementary Information.

    • Clement Ma,
    • Pierre Fontanillas,
    • Loukas Moutsianas,
    • Xueling Sim,
    • Adam E. Locke,
    • Heather M. Highland,
    • Cecilia M. Lindgren,
    • Jeroen R. Huyghe,
    • Eric R. Gamazon,
    • Jinyan Huang,
    • Aaron G. Day-Williams,
    • Taylor J. Maxwell,
    • Rector Arya,
    • Marie Loh,
    • Farook Thameem,
    • Claes Ladenvall,
    • Lu Qi,
    • Kathleen Stirrups,
    • Mark DePristo,
    • Jaakko Tuomilehto,
    • Dwaipayan Bharadwaj,
    • Giriraj R. Chandak,
    • Erik Ingelsson,
    • Jong-Young Lee,
    • Nancy J. Cox &
    • David Altshuler
  4. Deceased.

    • Hanna E. Abboud

Affiliations

  1. Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, USA

    • Christian Fuchsberger,
    • Tanya M. Teslovich,
    • Clement Ma,
    • Xueling Sim,
    • Thomas W. Blackwell,
    • Adam E. Locke,
    • Anne U. Jackson,
    • Jeroen R. Huyghe,
    • Heather M. Stringham,
    • Keng-Han Lin,
    • Ryan P. Welch,
    • Phoenix Kwan,
    • Goo Jun,
    • Goncalo Abecasis,
    • Laura J. Scott,
    • Hyun Min Kang &
    • Michael Boehnke
  2. Division of Genetic Epidemiology, Department of Medical Genetics, Molecular and Clinical Pharmacology, Medical University of Innsbruck, Innsbruck, Austria

    • Christian Fuchsberger
  3. Center for Biomedicine, European Academy of Bolzano/Bozen (EURAC), affiliated with the University of Lübeck, Bolzano, Italy

    • Christian Fuchsberger
  4. Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, USA

    • Jason Flannick,
    • Vineeta Agarwala,
    • Pierre Fontanillas,
    • Cecilia M. Lindgren,
    • Christopher Hartl,
    • Todd Green,
    • Alisa Manning,
    • Jason Carey,
    • George Grant,
    • Benjamin M. Neale,
    • Shaun Purcell,
    • Tõnu Esko,
    • Mauricio O. Carneiro,
    • Jared Maguire,
    • Ryan Poplin,
    • Khalid Shakir,
    • Timothy Fennell,
    • Mark DePristo,
    • Jacquelyn Murphy,
    • Robert Onofrio,
    • Eric Banks,
    • Stacey Gabriel,
    • Noël P. Burtt,
    • Jose C. Florez &
    • David Altshuler
  5. Department of Molecular Biology, Massachusetts General Hospital, Boston, Massachusetts, USA

    • Jason Flannick &
    • David Altshuler
  6. Wellcome Trust Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK

    • Anubha Mahajan,
    • Kyle J. Gaulton,
    • Loukas Moutsianas,
    • Davis J. McCarthy,
    • Manuel A. Rivas,
    • John R. B. Perry,
    • Neil R. Robertson,
    • N. William Rayner,
    • Juan Fernandez Tajes,
    • Cecilia M. Lindgren,
    • Martijn van de Bunt,
    • Richard D. Pearson,
    • Ashish Kumar,
    • Yuhui Chen,
    • Teresa Ferreira,
    • Momoko Horikoshi,
    • Erik Ingelsson,
    • Inga Prokopenko,
    • Anna L. Gloyn,
    • Peter Donnelly,
    • Gilean McVean,
    • Andrew P. Morris &
    • Mark I. McCarthy
  7. Harvard-MIT Division of Health Sciences and Technology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA

    • Vineeta Agarwala
  8. Department of Statistics, University of Oxford, Oxford, UK

    • Davis J. McCarthy &
    • Peter Donnelly
  9. Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Exeter, UK

    • John R. B. Perry,
    • Dorota Pasko,
    • Andrew R. Wood &
    • Timothy M. Frayling
  10. MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK

    • John R. B. Perry,
    • Robert A. Scott,
    • Claudia Langenberg &
    • Nicholas J. Wareham
  11. Department of Twin Research and Genetic Epidemiology, King’s College London, London, UK

    • John R. B. Perry,
    • Massimo Mangino,
    • Gabriela L. Surdulescu,
    • Dylan Hodgkiss,
    • Kerrin S. Small &
    • Timothy D. Spector
  12. Oxford Centre for Diabetes, Endocrinology and Metabolism, Radcliffe Department of Medicine, University of Oxford, Oxford, UK

    • Neil R. Robertson,
    • N. William Rayner,
    • Martijn van de Bunt,
    • Nicola L. Beer,
    • Momoko Horikoshi,
    • Jonathan C. Levy,
    • Christopher J. Groves,
    • Matt Neville,
    • Fredrik Karpe,
    • Inga Prokopenko,
    • Katharine R. Owen,
    • Anna L. Gloyn &
    • Mark I. McCarthy
  13. Department of Human Genetics, Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, UK

    • N. William Rayner,
    • Aaron G. Day-Williams,
    • John Danesh,
    • Kathleen Stirrups,
    • Manjinder Sandhu,
    • Inês Barroso &
    • Eleftheria Zeggini
  14. School of Computer Science, McGill University, Montreal, Quebec, Canada

    • Pablo Cingolani
  15. McGill University and Génome Québec Innovation Centre, Montreal, Quebec, Canada

    • Pablo Cingolani,
    • Yoshihiko Nagai &
    • Rob Sladek
  16. Human Genetics Center, The University of Texas Graduate School of Biomedical Sciences at Houston, The University of Texas Health Science Center at Houston, Houston, Texas, USA

    • Heather M. Highland
  17. Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, USA

    • Josee Dupuis,
    • Han Chen &
    • Denis Rybin
  18. National Heart, Lung, and Blood Institute's Framingham Heart Study, Framingham, Massachusetts, USA

    • Josee Dupuis
  19. Medical Genomics and Metabolic Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, USA

    • Peter S. Chines,
    • Lori L. Bonnycastle,
    • Narisu Narisu,
    • Amy Swift &
    • Francis S. Collins
  20. Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, USA

    • Han Chen &
    • Liming Liang
  21. Chronic Disease Epidemiology, Swiss Tropical and Public Health Institute, University of Basel, Basel, Switzerland

    • Ashish Kumar
  22. Institute of Genetic Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany

    • Martina Müller-Nurasyid,
    • Janina S. Ried,
    • Christian Gieger &
    • Konstantin Strauch
  23. Department of Medicine I, University Hospital Grosshadern, Ludwig-Maximilians-Universität, Munich, Germany

    • Martina Müller-Nurasyid
  24. Institute of Medical Informatics, Biometry and Epidemiology, Chair of Genetic Epidemiology, Ludwig-Maximilians-Universität, Munich, Germany

    • Martina Müller-Nurasyid &
    • Konstantin Strauch
  25. DZHK (German Centre for Cardiovascular Research), partner site Munich Heart Alliance, Munich, Germany

    • Martina Müller-Nurasyid &
    • Annette Peters
  26. The Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark

    • Niels Grarup,
    • Jette Bork-Jensen,
    • Mette Hollensted,
    • Johanne Marie Justesen,
    • Anette P. Gjesing,
    • Torben Hansen &
    • Oluf Pedersen
  27. Department of Medicine, Section of Genetic Medicine, The University of Chicago, Chicago, Illinois, USA

    • Eric R. Gamazon,
    • Hae Kyung Im &
    • Nancy J. Cox
  28. Department of Statistics, Seoul National University, Seoul, South Korea

    • Jaehoon Lee,
    • Iksoo Huh,
    • Yongkang Kim,
    • Selyeong Lee &
    • Taesung Park
  29. Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, Texas, USA

    • Jennifer E. Below,
    • Taylor J. Maxwell,
    • Goo Jun &
    • Craig L. Hanis
  30. Saw Swee Hock School of Public Health, National University of Singapore, National University Health System, Singapore

    • Peng Chen,
    • Xu Wang,
    • Ching-Yu Cheng,
    • Chiea-Chuen Khor,
    • Wei Yen Lim,
    • Jianjun Liu,
    • Kee Seng Chia,
    • Yik Ying Teo &
    • E. Shyong Tai
  31. Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, USA

    • Jinyan Huang,
    • Frank B. Hu &
    • Liming Liang
  32. Center for Genome Science, Korea National Institute of Health, Chungcheongbuk-do, South Korea

    • Min Jin Go,
    • Bong-Jo Kim,
    • Young Jin Kim,
    • Juyoung Lee,
    • Bok-Ghee Han &
    • Jong-Young Lee
  33. The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA

    • Michael L. Stitzel
  34. Departments of Computational Medicine & Bioinformatics and Human Genetics, University of Michigan, Ann Arbor, Michigan, USA

    • Stephen C. J. Parker
  35. Department of Clinical Sciences, Lund University Diabetes Centre, Genetic and Molecular Epidemiology Unit, Lund University, Malmö, Sweden

    • Tibor V. Varga &
    • Paul W. Franks
  36. Department of Epidemiology, Colorado School of Public Health, University of Colorado, Aurora, Colorado, USA

    • Tasha Fingerlin
  37. Department of Endocrinology and Metabolism, Shanghai Diabetes Institute, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, China

    • Cheng Hu &
    • Weiping Jia
  38. Singapore Eye Research Institute, Singapore National Eye Centre, Singapore

    • Mohammad Kamran Ikram,
    • Tin Aung,
    • Ching-Yu Cheng,
    • Chiea-Chuen Khor &
    • Tien Yin Wong
  39. Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, National University Health System, Singapore

    • Mohammad Kamran Ikram,
    • Tin Aung,
    • Ching-Yu Cheng,
    • Chiea-Chuen Khor &
    • Tien Yin Wong
  40. The Eye Academic Clinical Programme, Duke-NUS Graduate Medical School, Singapore

    • Mohammad Kamran Ikram,
    • Tin Aung,
    • Ching-Yu Cheng &
    • Tien Yin Wong
  41. Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea

    • Min-Seok Kwon,
    • Joon Yoon &
    • Taesung Park
  42. Department of Human Genetics, McGill University, Montreal, Quebec, Canada

    • Yoshihiko Nagai &
    • Rob Sladek
  43. Research Institute of the McGill University Health Centre, Montreal, Quebec, Canada

    • Yoshihiko Nagai
  44. Department of Epidemiology and Biostatistics, Imperial College London, London, UK

    • Weihua Zhang,
    • Uzma Afzal,
    • Benjamin Lehne,
    • Marie Loh,
    • William R. Scott,
    • Paul Elliott &
    • John C. Chambers
  45. Department of Cardiology, Ealing Hospital NHS Trust, Southall, Middlesex, UK

    • Weihua Zhang,
    • Sian-Tsung Tan,
    • Jaspal Singh Kooner &
    • John C. Chambers
  46. Departments of Medicine and Genetics, Albert Einstein College of Medicine, New York, USA

    • Nir Barzilai &
    • Gil Atzmon
  47. Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania - Perelman School of Medicine, Philadelphia, Pennsylvania, USA

    • Benjamin F. Voight
  48. Department of Genetics, University of Pennsylvania - Perelman School of Medicine, Philadelphia, Pennsylvania, USA

    • Benjamin F. Voight
  49. Department of Medicine, University of Texas Health Science Center, San Antonio, Texas, USA

    • Christopher P. Jenkinson,
    • Hanna E. Abboud,
    • Sharon P. Fowler,
    • Farook Thameem,
    • Donna M. Lehman &
    • Ralph A. DeFronzo
  50. Research, South Texas Veterans Health Care System, San Antonio, Texas, USA

    • Christopher P. Jenkinson
  51. Faculty of Health Sciences, Institute of Clinical Medicine, Internal Medicine, University of Eastern Finland, Kuopio, Finland

    • Teemu Kuulasmaa,
    • Johanna Kuusisto,
    • Alena Stančáková &
    • Markku Laakso
  52. Kuopio University Hospital, Kuopio, Finland

    • Johanna Kuusisto &
    • Markku Laakso
  53. Center for Genomics and Personalized Medicine Research, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA

    • Maggie C. Y. Ng,
    • Nicholette D. Palmer,
    • Pamela J. Hicks &
    • Donald W. Bowden
  54. Center for Diabetes Research, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA

    • Maggie C. Y. Ng,
    • Nicholette D. Palmer,
    • Pamela J. Hicks &
    • Donald W. Bowden
  55. Department of Biochemistry, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA

    • Nicholette D. Palmer,
    • Pamela J. Hicks &
    • Donald W. Bowden
  56. Centre for Research in Epidemiology and Population Health, Inserm U1018, Villejuif, France

    • Beverley Balkau
  57. German Institute of Human Nutrition Potsdam-Rehbruecke, Nuthetal, Germany

    • Heiner Boeing
  58. Department of Public Health and Caring Sciences, Geriatrics, Uppsala University, Uppsala, Sweden

    • Vilmantas Giedraitis &
    • Lars Lannfelt
  59. Centre for Chronic Disease Control, New Delhi, India

    • Dorairaj Prabhakaran &
    • Shah B. Ebrahim
  60. The Charles Bronfman Institute for Personalized Medicine, The Icahn School of Medicine at Mount Sinai, New York, USA

    • Omri Gottesman,
    • Yingchang Lu,
    • Erwin P. Bottinger &
    • Ruth J. F. Loos
  61. National Heart and Lung Institute, Cardiovascular Sciences, Hammersmith Campus, Imperial College London, London, UK

    • James Scott,
    • Sian-Tsung Tan &
    • Jaspal Singh Kooner
  62. Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA

    • Joshua D. Smith
  63. Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA

    • Benjamin M. Neale &
    • Mark J. Daly
  64. Center for Human Genetic Research, Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA

    • Benjamin M. Neale,
    • Shaun Purcell &
    • Jose C. Florez
  65. Department of Psychiatry, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, USA

    • Shaun Purcell
  66. Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK

    • Adam S. Butterworth,
    • Joanna M. M. Howson,
    • John Danesh &
    • Manjinder Sandhu
  67. Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, China

    • Heung Man Lee,
    • Vincent K. L. Lam,
    • Wing Yee So,
    • Claudia H. T. Tam,
    • Juliana C. N. Chan &
    • Ronald C. W. Ma
  68. Department of Internal Medicine, Seoul National University College of Medicine, Seoul, South Korea

    • Soo-Heon Kwak &
    • Kyong Soo Park
  69. Department of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA

    • Wei Zhao
  70. NIHR Blood and Transplant Research Unit in Donor Health and Genomics, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK

    • John Danesh
  71. Department of Molecular Medicine and Biopharmaceutical Sciences, Graduate School of Convergence Science and Technology, and College of Medicine, Seoul National University, Seoul, South Korea

    • Kyong Soo Park
  72. Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia, Pennsylvania, USA

    • Danish Saleheen
  73. Center for Non-Communicable Diseases, Karachi, Pakistan

    • Danish Saleheen
  74. Cardiovascular Division, Baylor College of Medicine, Houston, Texas, USA

    • David Aguilar
  75. Department of Pediatrics, University of Texas Health Science Center, San Antonio, Texas, USA

    • Rector Arya &
    • Daniel Esten Hale
  76. Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, National University Health System, Singapore

    • Edmund Chan &
    • E. Shyong Tai
  77. Department of Epidemiology, Murcia Regional Health Council, IMIB-Arrixaca, Murcia, Spain

    • Carmen Navarro
  78. CIBER Epidemiología y Salud Pública (CIBERESP), Universidad de Murcia, Murcia, Spain

    • Carmen Navarro
  79. Unit of Preventive Medicine and Public Health, School of Medicine, University of Murcia, Spain

    • Carmen Navarro
  80. Cancer Research and Prevention Institute (ISPO), Florence, Italy

    • Domenico Palli
  81. Department of Medicine, University of Mississippi Medical Center, Jackson, Mississippi, USA

    • Adolfo Correa &
    • Herman A. Taylor Jr.
  82. South Texas Diabetes and Obesity Institute, Regional Academic Health Center, University of Texas Rio Grande Valley, Brownsville, Texas, USA

    • Joanne E. Curran,
    • Satish Kumar &
    • John Blangero
  83. Department of Genetics, Texas Biomedical Research Institute, San Antonio, Texas, USA

    • Vidya S. Farook,
    • Sobha Puppala &
    • Ravindranath Duggirala
  84. Department of Internal Medicine, Section on Nephrology, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA

    • Barry I. Freedman
  85. Center of Biostatistics and Bioinformatics, University of Mississippi Medical Center, Jackson, Mississippi, USA

    • Michael Griswold
  86. Department of Paediatrics, Yong Loo Lin School of Medicine, National University of Singapore, National University Health System, Singapore

    • Chiea-Chuen Khor
  87. Division of Human Genetics, Genome Institute of Singapore, A*STAR, Singapore

    • Chiea-Chuen Khor &
    • Jianjun Liu
  88. CNRS-UMR8199, Lille University, Lille Pasteur Institute, Lille, France

    • Dorothée Thuillier,
    • Loïc Yengo &
    • Philippe Froguel
  89. Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, Netherlands

    • Yvonne T. van der Schouw
  90. Institute of Health Sciences, University of Oulu, Oulu, Finland

    • Marie Loh
  91. Translational Laboratory in Genetic Medicine (TLGM), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore

    • Marie Loh
  92. Jackson Heart Study, University of Mississippi Medical Center, Jackson, Mississippi, USA

    • Solomon K. Musani
  93. College of Public Services, Jackson State University, Jackson, Mississippi, USA

    • Gregory Wilson
  94. KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Bergen, Norway

    • Pål Rasmus Njølstad
  95. Department of Pediatrics, Haukeland University Hospital, Bergen, Norway

    • Pål Rasmus Njølstad
  96. Institute of Human Genetics, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany

    • Thomas Schwarzmayr,
    • Thomas Wieland,
    • Tim M. Strom &
    • Thomas Meitinger
  97. Department of Clinical Sciences, Diabetes and Endocrinology, Lund University Diabetes Centre, Malmö, Sweden

    • João Fadista,
    • Jasmina Kravic,
    • Valeriya Lyssenko,
    • Claes Ladenvall,
    • Anders H. Rosengren &
    • Leif Groop
  98. Institute of Clinical Diabetology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University, Düsseldorf, Germany

    • Christian Herder &
    • Michael Roden
  99. German Center for Diabetes Research (DZD), Neuherberg, Germany

    • Christian Herder,
    • Jennifer Kriebel,
    • Michael Roden,
    • Martin Hrabé de Angelis,
    • Barbara Thorand,
    • Christa Meisinger,
    • Annette Peters,
    • Cornelia Huth &
    • Harald Grallert
  100. Institute of Regional Health Research, University of Southern Denmark, Odense, Denmark

    • Ivan Brandslund
  101. Department of Clinical Biochemistry, Vejle Hospital, Vejle, Denmark

    • Ivan Brandslund
  102. Department of Internal Medicine and Endocrinology, Vejle Hospital, Vejle, Denmark

    • Cramer Christensen
  103. Department of Health, National Institute for Health and Welfare, Helsinki, Finland

    • Heikki A. Koistinen,
    • Leena Kinnunen &
    • Jaakko Tuomilehto
  104. Abdominal Center: Endocrinology, University of Helsinki and Helsinki University Central Hospital, Helsinki, Finland

    • Heikki A. Koistinen,
    • Liisa Hakaste &
    • Tiinamaija Tuomi
  105. Minerva Foundation Institute for Medical Research, Helsinki, Finland

    • Heikki A. Koistinen
  106. Department of Medicine, University of Helsinki and Helsinki University Central Hospital, Helsinki, Finland

    • Heikki A. Koistinen
  107. Division of Cardiovascular and Diabetes Medicine, Medical Research Institute, Ninewells Hospital and Medical School, Dundee, UK

    • Alex S. F. Doney
  108. Estonian Genome Center, University of Tartu, Tartu, Estonia

    • Tõnu Esko,
    • Lili Milani,
    • Evelin Mihailov,
    • Andres Metspalu,
    • Reedik Mägi &
    • Andrew P. Morris
  109. Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA

    • Tõnu Esko &
    • David Altshuler
  110. Division of Endocrinology, Boston Children's Hospital, Boston, Massachusetts, USA

    • Tõnu Esko
  111. Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK

    • Andrew J. Farmer
  112. Folkhälsan Research Centre, Helsinki, Finland

    • Liisa Hakaste,
    • Bo Isomaa &
    • Tiinamaija Tuomi
  113. Research Programs Unit, Diabetes and Obesity, University of Helsinki, Helsinki, Finland

    • Liisa Hakaste &
    • Tiinamaija Tuomi
  114. Steno Diabetes Center, Gentofte, Denmark

    • Marit E. Jørgensen
  115. Research Centre for Prevention and Health, Capital Region of Denmark, Glostrup, Denmark

    • Torben Jørgensen &
    • Allan Linneberg
  116. Department of Public Health, Institute of Health Sciences, University of Copenhagen, Copenhagen, Denmark

    • Torben Jørgensen
  117. Faculty of Medicine, Aalborg University, Aalborg, Denmark

    • Torben Jørgensen
  118. Department of Primary Health Care, Vaasa Central Hospital, Vaasa, Finland

    • Annemari Käräjämäki
  119. Diabetes Center, Vaasa Health Care Center, Vaasa, Finland

    • Annemari Käräjämäki
  120. Institute of Epidemiology II, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany

    • Jennifer Kriebel,
    • Barbara Thorand,
    • Christa Meisinger,
    • Annette Peters,
    • Cornelia Huth,
    • Harald Grallert &
    • Christian Gieger
  121. Research Unit of Molecular Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany

    • Jennifer Kriebel,
    • Harald Grallert,
    • Christian Gieger &
    • Thomas Illig
  122. Institute for Biometrics and Epidemiology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University, Düsseldorf, Germany

    • Wolfgang Rathmann
  123. Department of Public Health, Section of General Practice, Aarhus University, Aarhus, Denmark

    • Torsten Lauritzen
  124. Department of Clinical Experimental Research, Rigshospitalet, Glostrup, Denmark

    • Allan Linneberg
  125. Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark

    • Allan Linneberg
  126. Department of Clinical Sciences, Hypertension and Cardiovascular Disease, Lund University, Malmö, Sweden

    • Olle Melander
  127. Oxford NIHR Biomedical Research Centre, Oxford University Hospitals Trust, Oxford, UK

    • Matt Neville,
    • Fredrik Karpe,
    • Katharine R. Owen,
    • Anna L. Gloyn &
    • Mark I. McCarthy
  128. Department of Clinical Sciences, Diabetes and Cardiovascular Disease, Genetic Epidemiology, Lund University, Malmö, Sweden

    • Marju Orho-Melander
  129. Department of Nutrition, Harvard School of Public Health, Boston, Massachusetts, USA

    • Lu Qi,
    • Qibin Qi,
    • Frank B. Hu &
    • Paul W. Franks
  130. Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, USA

    • Lu Qi
  131. Department of Epidemiology and Population Health, Albert Einstein College of Medicine, New York, USA

    • Qibin Qi
  132. Department of Endocrinology and Diabetology, Medical Faculty, Heinrich-Heine University, Düsseldorf, Germany

    • Michael Roden
  133. Department of Public Health and Clinical Medicine, Umeå University, Umeå, Sweden

    • Olov Rolandsson &
    • Paul W. Franks
  134. High Throughput Genomics, Oxford Genomics Centre, Wellcome Trust Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK

    • Christine Blancher,
    • Gemma Buck,
    • Joseph Trakalo &
    • David Buck
  135. Institute of Experimental Genetics, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany

    • Martin Hrabé de Angelis
  136. Center of Life and Food Sciences Weihenstephan, Technische Universität München, Freising-Weihenstephan, Germany

    • Martin Hrabé de Angelis
  137. William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK

    • Panos Deloukas
  138. Princess Al-Jawhara Al-Brahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), King Abdulaziz University, Jeddah, Saudi Arabia

    • Panos Deloukas
  139. Department of Clinical Sciences, Medicine, Lund University, Malmö, Sweden

    • Peter Nilsson
  140. Faculty of Health Sciences, University of Southern Denmark, Odense, Denmark

    • Torben Hansen
  141. Department of Social Services and Health Care, Jakobstad, Finland

    • Bo Isomaa
  142. Metabolic Research Laboratories, Institute of Metabolic Science, University of Cambridge, Cambridge, UK

    • Stephen P. O’Rahilly &
    • Inês Barroso
  143. Pat Macpherson Centre for Pharmacogenetics and Pharmacogenomics, Ninewells Hospital and Medical School, University of Dundee, Dundee, UK

    • Colin N. A. Palmer
  144. Foundation for Research in Health, Exercise and Nutrition, Kuopio Research Institute of Exercise Medicine, Kuopio, Finland

    • Rainer Rauramaa
  145. Center for Vascular Prevention, Danube University Krems, Krems, Austria

    • Jaakko Tuomilehto
  146. Diabetes Research Group, King Abdulaziz University, Jeddah, Saudi Arabia

    • Jaakko Tuomilehto
  147. Instituto de Investigacion Sanitaria del Hospital Universitario La Paz (IdiPAZ), University Hospital La Paz, Autonomous University of Madrid, Madrid, Spain

    • Jaakko Tuomilehto
  148. National Institute for Health and Welfare, Helsinki, Finland

    • Jaakko Tuomilehto &
    • Veikko Salomaa
  149. Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, USA

    • Richard M. Watanabe
  150. Department of Physiology & Biophysics, Keck School of Medicine, University of Southern California, Los Angeles, California, USA

    • Richard M. Watanabe
  151. Diabetes and Obesity Research Institute, Keck School of Medicine, University of Southern California, Los Angeles, California, USA

    • Richard M. Watanabe
  152. Department of Medical Sciences, Molecular Medicine and Science for Life Laboratory, Uppsala University, Uppsala, Sweden

    • Ann-Christine Syvänen
  153. Cedars-Sinai Diabetes and Obesity Research Institute, Los Angeles, California, USA

    • Richard N. Bergman
  154. Functional Genomics Unit, CSIR-Institute of Genomics & Integrative Biology (CSIR-IGIB), New Delhi, India

    • Dwaipayan Bharadwaj
  155. Department of Biomedical Science, Hallym University, Chuncheon, South Korea

    • Yoon Shin Cho
  156. CSIR-Centre for Cellular and Molecular Biology, Hyderabad, Telangana, India

    • Giriraj R. Chandak
  157. Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Hong Kong, China

    • Juliana C. N. Chan &
    • Ronald C. W. Ma
  158. Hong Kong Institute of Diabetes and Obesity, The Chinese University of Hong Kong, Hong Kong, China

    • Juliana C. N. Chan &
    • Ronald C. W. Ma
  159. MRC-PHE Centre for Environment and Health, Imperial College London, London, UK

    • Paul Elliott
  160. The Biostatistics Center, The George Washington University, Rockville, Maryland, USA

    • Kathleen A. Jablonski
  161. Department of Medicine, Division of Endocrinology, Diabetes and Nutrition, and Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, Maryland, USA

    • Toni I. Pollin
  162. Department of Endocrinology and Metabolism, All India Institute of Medical Sciences, New Delhi, India

    • Nikhil Tandon
  163. Department of Genomics of Common Disease, School of Public Health, Imperial College London, London, UK

    • Philippe Froguel &
    • Inga Prokopenko
  164. Life Sciences Institute, National University of Singapore, Singapore

    • Yik Ying Teo
  165. Department of Statistics and Applied Probability, National University of Singapore, Singapore

    • Yik Ying Teo
  166. Endocrinology and Metabolism Service, Hadassah-Hebrew University Medical Center, Jerusalem, Israel

    • Benjamin Glaser
  167. The Medical School, Institute of Cellular Medicine, Newcastle University, Newcastle, UK

    • Mark Walker
  168. Department of Medical Sciences, Molecular Epidemiology and Science for Life Laboratory, Uppsala University, Uppsala, Sweden

    • Erik Ingelsson
  169. Hannover Unified Biobank, Hannover Medical School, Hanover, Germany

    • Thomas Illig
  170. Institute for Human Genetics, Hannover Medical School, Hanover, Germany

    • Thomas Illig
  171. Department of Medical Sciences, Uppsala University, Uppsala, Sweden

    • Lars Lind
  172. Data Sciences and Data Engineering, Broad Institute, Cambridge, Massachusetts, USA

    • Yossi Farjoun
  173. Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland

    • Tiinamaija Tuomi &
    • Leif Groop
  174. Imperial College Healthcare NHS Trust, Imperial College London, London, UK

    • Jaspal Singh Kooner &
    • John C. Chambers
  175. Clinical Research Centre, Centre for Molecular Medicine, Ninewells Hospital and Medical School, Dundee, UK

    • Andrew D. Morris
  176. The Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, UK

    • Andrew D. Morris
  177. University of Exeter Medical School, University of Exeter, Exeter, UK

    • Andrew T. Hattersley
  178. Department of Natural Science, University of Haifa, Haifa, Israel

    • Gil Atzmon
  179. Institute of Human Genetics, Technische Universität München, Munich, Germany

    • Tim M. Strom &
    • Thomas Meitinger
  180. Departments of Medicine and Human Genetics, The University of Chicago, Chicago, Illinois, USA

    • Graeme I. Bell
  181. Cardiovascular & Metabolic Disorders Program, Duke-NUS Medical School Singapore, Singapore

    • E. Shyong Tai
  182. Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK

    • Gilean McVean
  183. Department of Physiology and Biophysics, University of Mississippi Medical Center, Jackson, Mississippi, USA

    • James G. Wilson
  184. Department of Laboratory Medicine & Institute for Human Genetics, University of California, San Francisco, San Francisco, California, USA

    • Mark Seielstad
  185. Blood Systems Research Institute, San Francisco, California, USA

    • Mark Seielstad
  186. General Medicine Division, Massachusetts General Hospital and Department of Medicine, Harvard Medical School, Boston, Massachusetts, USA

    • James B. Meigs
  187. Division of Endocrinology and Metabolism, Department of Medicine, McGill University, Montreal, Quebec, Canada

    • Rob Sladek
  188. Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA

    • Eric S. Lander
  189. Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, USA

    • Karen L. Mohlke
  190. Department of Medicine, Harvard Medical School, Boston, Massachusetts, USA

    • Jose C. Florez &
    • David Altshuler
  191. Diabetes Research Center (Diabetes Unit), Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA

    • Jose C. Florez &
    • David Altshuler
  192. Department of Biostatistics, University of Liverpool, Liverpool, UK

    • Andrew P. Morris
  193. Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA

    • David Altshuler

Contributions

Author contributions are described in the Supplementary Information.

Competing financial interests

R.A.D. has been a member of advisory boards for Astra Zeneca, Novo Nordisk, Janssen, Lexicon and Boehringer-Ingelheim; received research support from Bristol Myers Squibb, Boehringer-Ingelheim, Takeda and Astra Zeneca; and is a member of speakers’ bureaus for Novo Nordisk and Astra Zeneca. J.C.F. has received consulting honoraria from Pfizer and PanGenX. M.I.M. has received consulting and advisory board honoraria from Pfizer, Lilly, and Novo Nordisk. G.M. and P.D. are co-founders of Genomics PLC, which provides genome analytics. D.A. is an employee of and holds equity in Vertex Pharmaceuticals.

Corresponding authors

Correspondence to:

Whole-genome sequence data from the GoT2D project are available by application to the European Genome-Phenome Archive (https://www.ebi.ac.uk/ega/home) under accession number EGAS00001001459 and from dbGAP (http://www.ncbi.nlm.nih.gov/gap) under accession number phs000840.v1.p1. Whole-exome sequence data from the T2D-GENES project are available from the European Genome-Phenome Archive (https://www.ebi.ac.uk/ega/home) under accession number EGAS00001001460 and from dbGAP (http://www.ncbi.nlm.nih.gov/gap) under accession numbers phs000847.v1.p1, phs001093.v1.p1, phs001095.v1.p1, phs001096.v1.p1, phs001097.v1.p1, phs001098.v1.p1, phs001099.v1.p1, phs001100.v1.p1 and phs001102.v1.p1. Summary-level data from the exome array component of this project (and from the exome and genome sequences) can be freely accessed at the Accelerating Medicines Partnership T2D portal (http://www.type2diabetesgenetics.org), and similar data from the GoT2D-imputed data at http://www.diagram-consortium.org.

Extended data figures and tables

Extended Data Figures

  1. Extended Data Figure 1: Summary of samples and quality control procedures. (470 KB)

    This figure summarizes data generation for whole-genome sequencing (GoT2D), exome sequencing (GoT2D and T2D-GENES), exome array genotyping (DIAGRAM), and GWAS imputation (DIAGRAM).

  2. Extended Data Figure 2: Power for single and aggregate variant association. (493 KB)

    a–g, Power to detect single-variant association (α = 5 × 10⁻⁸) at varying minor allele frequencies (x-axis) and allelic ORs (y-axis) for seven effective sample size (Neff) scenarios relevant to the genomes (a–c) and exomes (d–g) components of this project. a, Variant observed in 2,657 samples (the effective size of the GoT2D integrated panel). b, Variant observed in 28,350 samples (the effective size of the imputed data set). c, Variant observed in the GoT2D integrated panel and the imputed data set (effective sample size 31,007). d, Ancestry-specific variant in 2,000 samples (the size of each of the non-European exome sequence data sets). e, European-specific variant in 5,000 samples (the combined size of the European exome sequence data sets). f, Variant observed with shared frequency across all ancestry groups in 12,940 samples (the size of the combined exome sequence data set). g, Variant observed in the combined exome array and sequencing data set (effective sample size 82,758). h, i, Power for gene-based test of association (SKAT-O) according to liability variance explained. In h, 50% of the variants contribute to disease risk and the remaining 50% have no effect on disease risk; in i, 100% of the variants contribute to disease risk. For each, sample sizes considered are 2,000 (ancestry-specific effects; green) and 12,940 (ancestry-shared effects; blue). Power is shown for two levels of significance (α = 2.5 × 10⁻⁶ and α = 0.001). From these simulation studies, it is clear that under the optimistic model, where effects are shared across all ethnicities (blue line) and all variants contribute, power is >60% for 1% variance explained and α = 2.5 × 10⁻⁶. However, power declines rapidly if either criterion is relaxed.
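The general shape of single-variant power curves like those in panels a–g can be approximated with a short calculation. The following is a minimal sketch and not the procedure used by the authors: it assumes an additive log-odds model, small effects, and a two-sided Wald test with a normal approximation. The function names are illustrative, and the effective sample size formula Neff = 4 / (1/Ncases + 1/Ncontrols) is a common convention assumed here (it does give about 2,657 for 1,326 cases and 1,331 controls).

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def effective_sample_size(n_cases, n_controls):
    """Assumed convention: Neff = 4 / (1/Ncases + 1/Ncontrols)."""
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


def single_variant_power(n_eff, maf, odds_ratio, alpha=5e-8):
    """Approximate power of a two-sided 1-d.f. test for one variant.

    Uses the small-effect approximation Var(beta_hat) ~ 2 / (Neff * p * (1 - p)),
    so the non-centrality parameter is Neff * p * (1 - p) * beta^2 / 2.
    """
    beta = math.log(odds_ratio)
    ncp = n_eff * maf * (1.0 - maf) * beta ** 2 / 2.0
    # Critical value of |Z| at significance level alpha (two-sided).
    z_crit = -_STD_NORMAL.inv_cdf(alpha / 2.0)
    shift = math.sqrt(ncp)
    # P(|Z + shift| > z_crit) for a standard normal Z.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Effective sample sizes quoted for panels a, b and c.
    for n_eff in (2657, 28350, 31007):
        power = single_variant_power(n_eff, maf=0.05, odds_ratio=1.3)
        print(f"Neff={n_eff:>6}  MAF=0.05  OR=1.3  power={power:.3f}")
```

Sweeping `maf` and `odds_ratio` over a grid for a fixed `n_eff` gives a surface comparable in spirit to one panel; exact values will differ from the published figure wherever its underlying model departs from these assumptions.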

  3. Extended Data Figure 3: Single variant analyses. (282 KB)

    a–c, Manhattan plot of single-variant analyses generated from exome sequence data in 6,504 cases and 6,436 controls of African American, East Asian, European, Hispanic, and South Asian ancestry (a); exome array genotypes in 28,305 cases and 51,549 controls of European ancestry (b); and combined meta-analysis of exome array and exome sequence samples (c). Coding variants are categorized according to their relationships to the previously reported lead variant from GWAS region. Loci achieving genome-wide significance only in the combined analysis are highlighted in bold. The HNF1A variant reaching genome-wide significance in the combined analysis is a synonymous variant (Thr515Thr). The dashed horizontal line in each panel designates the threshold for genome-wide significance (P < 5 × 10⁻⁸).

  4. Extended Data Figure 4: Classification of coding variants according to their relationship to reported lead variants for each GWAS region. (715 KB)

    The ideogram shows the location of 25 coding variant associations at 16 loci described in the text. The number in each circle corresponds to the number of associated variants at each locus. Variants are grouped into five categories based on inferred relationship with the GWAS lead variant. For some of these categories, the figure includes representative regional association plots based on exome array meta-analysis data from 28,305 cases and 51,549 controls. The locus displayed for each category is designated in bold. The first plot in each panel shows the unconditional association results; the middle plot the association results after conditioning on the non-coding GWAS SNP; and the last plot the results after conditioning on the most significantly associated coding variant. Each point represents an SNP in the exome array meta-analysis, plotted with its P value (on a −log₁₀ scale) as a function of the genomic position (hg19). In each panel, the lead coding variant is represented by the purple symbol. The colour-coding of all other SNPs indicates LD with the lead SNP (estimated by European r² from the 1000G March 2012 reference panel: red r² ≥ 0.8; gold 0.6 ≤ r² < 0.8; green 0.4 ≤ r² < 0.6; cyan 0.2 ≤ r² < 0.4; blue r² < 0.2; grey r² unknown). Gene annotations are taken from the University of California Santa Cruz genome browser. GWS: genome-wide significance. *Seven variants, three at ASCC2, and one each at THADA, TSPAN8, FES and HNF4A did not achieve genome-wide significance themselves, but are included because they fall into genes and/or regions with other significant association signals (see text).

  5. Extended Data Figure 5: Exclusion of synthetic associations and construction of credible causal variant sets at T2D GWAS loci. (160 KB)

    Ten T2D GWAS loci were selected for synthetic association testing (P < 0.001; see Methods). a, The effect size observed at the GWAS index SNV (sequence data) before (navy blue) and after (light blue, grey) conditioning on candidate rare and low-frequency (MAF < 5%) variants that could produce synthetic association. b, Example of synthetic association exclusion at the TCF7L2 locus. Error bars represent 95% confidence intervals for the index SNP odds ratio as rare variants are greedily added to the model. c, The size of credible sets at T2D GWAS loci when constructed from the GoT2D data, compared to the sizes when restricted to variants in the 1000G or HapMap data.

  6. Extended Data Figure 6: Genome enrichment analysis in GoT2D whole genome sequence data. (383 KB)

    n = 2,657. a, Functional annotation categories were defined using transcription, chromatin state and transcription factor binding data from GENCODE, ENCODE and other studies. b, T2D association statistics for variants at each T2D locus were jointly modelled with functional annotation using fgwas. In the resulting model we identified enrichment of coding exons (CDS), transcription factor binding sites (TFBS), mature adipose active enhancers and promoters (hASC-t4 EnhA, TssA), pancreatic islet active and weak enhancers (HI EnhA, EnhWk), pre-adipose active and weak enhancers (hASC-t1 EnhA, EnhWk), embryonic stem cell active promoters (H1-hESC TssA) and 5′UTRs. Dots represent enrichment estimates and horizontal lines the 95% confidence intervals. c, At the CCND2 locus, three variants not present in HapMap2 have a combined 90% posterior probability of being causal (rs4238013, rs3217801, rs73040004). One of these variants, rs3217801, is a 2-bp indel that overlaps an islet enhancer element.

  7. Extended Data Figure 7: Low frequency variants in exome array data. (533 KB)

    Results from meta-analysis of 43,045 low-frequency and common coding variants on the exome array (assayed in 79,854 European subjects). a, Observed allelic ORs as a function of MAF. Variants missing in more than eight cohorts or polymorphic in only one cohort were excluded. Coloured lines represent contours for liability variance explained. Regions shaded grey denote ranges of OR and MAF consistent with 80% power (in this case, at α = 5 × 10⁻⁷) to detect single-variant associations in this data set (given the observed range of missing data). Variants with a black collar are those highlighted by a bounding analysis as having a probability >0.8 of having liability-scale variance explained (LVE) > 0.1%. b, Distribution of each variant in the MAF/OR space was computed by assuming T2D prevalence of 8% and a beta and normal distribution for MAF and OR, respectively. Probability is obtained by integrating the joint MAF–OR distributions over ranges of LVE. c, Single variant association, liability and bounding results for the known T2D GWAS variants on the exome array (see Methods).
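
The liability variance explained (LVE) contours in Extended Data Figure 7 relate an allele's frequency and odds ratio to a share of variance on the liability scale. The sketch below illustrates one common way to approximate that quantity under a liability-threshold model. It is not taken from the paper's Methods: the function name, the Hardy–Weinberg and multiplicative-odds assumptions, and the unit residual variance within each genotype are all assumptions made for illustration. Only the 8% prevalence comes from the caption.

```python
from statistics import NormalDist


def liability_variance_explained(maf, odds_ratio, prevalence=0.08):
    """Approximate liability-scale variance explained by one biallelic variant.

    Assumptions (illustrative, not from the source): Hardy-Weinberg genotype
    frequencies, a multiplicative (per-allele) odds ratio for the allele whose
    frequency is `maf`, and residual liability variance of 1 within genotypes.
    """
    # Genotype frequencies for 0, 1 and 2 copies of the allele.
    freqs = [(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2]

    def genotype_risks(base_risk):
        # Disease risk per genotype, given the risk for zero copies.
        base_odds = base_risk / (1 - base_risk)
        risks = []
        for copies in range(3):
            odds = base_odds * odds_ratio ** copies
            risks.append(odds / (1 + odds))
        return risks

    # Bisection: find the zero-copy risk at which the population-average
    # risk equals the assumed prevalence.
    lo, hi = 1e-12, 1 - 1e-12
    for _ in range(200):
        mid = (lo + hi) / 2
        mean_risk = sum(f * r for f, r in zip(freqs, genotype_risks(mid)))
        if mean_risk < prevalence:
            lo = mid
        else:
            hi = mid
    risks = genotype_risks((lo + hi) / 2)

    # Convert each genotype risk to a shift in mean liability relative to
    # the population threshold.
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1 - prevalence)
    shifts = [threshold - std_normal.inv_cdf(1 - r) for r in risks]

    mean_shift = sum(f * s for f, s in zip(freqs, shifts))
    var_shift = sum(f * (s - mean_shift) ** 2 for f, s in zip(freqs, shifts))

    # Share of total liability variance attributable to the variant.
    return var_shift / (1 + var_shift)
```

Under these assumptions, comparing the returned value with 0.001 corresponds to the 0.1% LVE threshold mentioned in the caption; the bounding analysis itself, which integrates over MAF and OR distributions, is not reproduced here.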

Extended Data Tables

  1. Extended Data Table 1: Summary information for sample sets used in the association analyses (545 KB)
  2. Extended Data Table 2: Counts and properties of variants identified in sequenced subjects (440 KB)
  3. Extended Data Table 3: Characterization of variant associations through conditional analysis (553 KB)
  4. Extended Data Table 4: Testing for synthetic associations across GWAS-identified T2D loci (248 KB)

Supplementary information

PDF files

  1. Supplementary Information (22.5 MB)

    This file contains Supplementary Tables and Figures 1–32 (see separate Excel file for Supplementary Table 20) and Author contribution and acknowledgement lists.

Excel files

  1. Supplementary Table 20 (77 KB)

    This file contains an overview of 634 genes at 81 GWAS-identified T2D loci.

Additional data