The largest DNA-sequencing study of type 2 diabetes conducted so far concludes that, contrary to expectation, low-frequency and rare genetic variants do not contribute significantly to disease risk. See Article p.41
Type 2 diabetes is a major cause of illness and death, particularly in people of African, Hispanic and Asian ancestry. Despite indications of strong familial origins (ref. 1), it was not until 2012 that genetic variants associated with the disease were robustly established, thanks to a genome-wide association study (GWAS) that looked for common risk variants in more than 100,000 people from several ancestral populations (ref. 2). Nonetheless, only around 10% of the risk attributable to genetic factors has been identified. An obvious next approach is to interrogate the genome for infrequent and rare variants that could affect risk individually or in aggregate. On page 41, Fuchsberger et al. (ref. 3) present a comprehensive evaluation of the role of rare and infrequent variants in the risk of developing type 2 diabetes.
The authors sequenced exomes, which encompass protein-coding regions (about 1–2% of the human genome), from around 6,500 people with type 2 diabetes and 6,400 healthy controls, from 5 ancestry groups. They also sequenced whole genomes from some 1,300 people with the disease and 1,300 ancestry-matched controls. After rigorous quality control, the whole-genome sequences revealed approximately 27 million sequences or bases that varied between individuals.
Fuchsberger and colleagues found that 126 of these variants, each located in one of four genes, were significantly associated with an altered risk of type 2 diabetes. Two of the genes, TCF7L2 and ADCY5, had been previously identified as containing commonly occurring variants associated with diabetes risk, and a risk-associated variant in a third, CCND2, had been found to occur at low frequency (ref. 4). The final gene, EML4, contained a common variant that had not been identified from the GWAS, but this discovery was not replicated in a larger data set.
Crucially, there was no significant evidence for rare or low-frequency disease-associated variants in regulatory elements, which modulate gene expression, or in coding sequences. Combining samples that had undergone both exome and whole-genome sequencing revealed only one more risk variant, in the gene PAX4; this variant had been previously detected in people from East Asia (refs 5, 6).
To broaden their search, the authors increased the number of subjects to around 90,000. They analysed the participants' DNA using a customized array — a tool that allowed the analysis of specific coding sequences in which diabetes-associated variation might arise, thereby generating fewer raw data than genome sequencing. In this expanded data set, they found another 18 common variants in 13 genes. Only one of these, in MTMR3, had not previously been identified by the GWAS. Fuchsberger and colleagues conclude that there is little evidence that rare or infrequent variants affect the risk of developing diabetes. Instead, the authors suggest that almost all significantly diabetes-associated variants are common in the population and have been previously detected by the GWAS.
If the contribution of common variants to the genetic risk of diabetes is relatively limited, and if there is little support for a contribution from rare and low-frequency variants, where are the culprits hiding? Once dubbed “a geneticist's nightmare” (ref. 7), diabetes seems to be living up to its reputation. What are researchers missing? Could other genetic tools be applied?
One approach to improving the search would be to add more whole-genome sequences, in the hope that a larger population will enable smaller effects to be detected. Size does matter in genomic studies, but the current work suggests that a larger sample will not greatly increase the number of risk-associated genes or variants. However, the cost of increasing sample size might be surprisingly low. Several initiatives from the US National Institutes of Health, including the Trans-Omics for Precision Medicine program, the Centers for Common Disease Genomics project and the Precision Medicine Initiative, will provide whole-genome sequences and associated data on phenotypes (the severity of a range of diabetes-associated traits) free of charge. Even if the expected yield of such analyses is low, finding a handful of rare variants, for example those that confer loss of gene function, such as one found in PCSK9 (ref. 8), could have a major impact.
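To make the link between sample size and detectable effect size concrete, the following is a minimal sketch of an approximate power calculation for a single-variant allelic test. It is not the analysis used by Fuchsberger et al.; the function names, the normal approximation, the genome-wide significance threshold of 5e-8 and the example frequencies, odds ratio and sample sizes are all assumptions chosen for illustration.

```python
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Allele frequency in cases implied by a per-allele odds ratio."""
    return odds_ratio * control_freq / (1 + control_freq * (odds_ratio - 1))


def approx_power(n_cases, n_controls, control_freq, odds_ratio, alpha=5e-8):
    """Approximate power of a two-sided allelic test (normal approximation).

    Each person contributes two alleles, so the allele counts are
    2 * n_cases and 2 * n_controls. The opposite tail is ignored.
    """
    norm = NormalDist()
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    # Standard error of the difference in allele frequencies.
    se = (p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls)) ** 0.5
    z_effect = abs(p1 - p0) / se
    z_crit = norm.inv_cdf(1 - alpha / 2)
    return norm.cdf(z_effect - z_crit)


# Hypothetical rare variant: 0.5% allele frequency, odds ratio 1.5.
small = approx_power(6_500, 6_400, control_freq=0.005, odds_ratio=1.5)
large = approx_power(90_000, 90_000, control_freq=0.005, odds_ratio=1.5)
print(f"power with ~13,000 people:  {small:.4f}")
print(f"power with ~180,000 people: {large:.4f}")
```

Under these assumed values, power is close to zero in the smaller sample and close to one in the larger, which is why absence of signal at a given sample size rules out only effects above a certain magnitude.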
There are many alternatives to increasing sample size. These include focusing only on the variants that confer an increased risk of harmful phenotypes (ref. 9), analysing diverse populations or those known to be at low risk of disease (ref. 10), and considering other genetic 'architectures', such as whether the disease is caused by a modest contribution from many common variants, rather than a large contribution from a few rare ones. Moreover, genetic risk should be considered in the context of the complex environmental risk factors with which it is inextricably intertwined. For example, being overweight does not always lead to diabetes, but fatty tissue increases insulin resistance, and abdominal fat increases the risk of diabetes more than does fat in hips and thighs. Physical activity not only controls weight, but also uses glucose as energy and increases cellular insulin sensitivity. An understanding of genetic risk factors is needed to elucidate the mechanisms that lead to such complex, variable phenotypes.
Many genes and variants, both common and rare, could influence the declining function of the β-cells that store and release insulin. However, relatively small sets of variants might be sufficient to elevate risk in the context of other genetic or environmental factors. There might be cassettes of variants unique to an individual, called private variants, that lead to diabetes only in certain conditions. In this scenario, the large case–control design is not necessarily optimal, because the averaging of effects will obscure crucial gene sets. Family-based designs (ref. 11) and efforts to further dissect the subtleties of each phenotype might be required to identify private variants (ref. 9).
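The averaging argument can be illustrated with a toy calculation. The sketch below assumes a simple mixture model that is not taken from the study: a variant set has a strong effect in only a fraction of cases, and the remaining cases carry it at the control frequency. The function names and all numerical values are hypothetical.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1 - p)


def pooled_odds_ratio(control_carrier_freq, subgroup_odds_ratio, subgroup_fraction):
    """Odds ratio seen when all cases are pooled in one case-control comparison.

    A fraction `subgroup_fraction` of cases carries the variant set at a
    frequency implied by `subgroup_odds_ratio`; all other cases carry it
    at the control frequency (assumed mixture model).
    """
    p0 = control_carrier_freq
    sub_odds = subgroup_odds_ratio * odds(p0)
    p_sub = sub_odds / (1 + sub_odds)
    # Carrier frequency averaged over every case in the study.
    p_cases = subgroup_fraction * p_sub + (1 - subgroup_fraction) * p0
    return odds(p_cases) / odds(p0)


# Strong effect (odds ratio 5) confined to 5% of cases; 1% of controls are carriers.
diluted = pooled_odds_ratio(0.01, 5.0, 0.05)
print(f"pooled odds ratio: {diluted:.2f}")
```

With these assumed inputs, a fivefold effect within the subgroup appears as a pooled odds ratio only slightly above 1, which is the kind of signal a large case-control comparison can easily miss.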
Historically, to obtain a grasp of risk-associated variants, genetic studies took advantage of extremes of disease presentation (such as mild or severe phenotypes, or early or late diagnosis; ref. 12), or of the occurrence of disease in low-risk groups, or of populations in which private variants might contribute to disease risk. Although such studies have already been performed, it might be time to revisit these approaches using whole-genome-sequence and refined phenotypic data.
The authors' study marks the first large-scale use of whole-genome sequencing data to tackle this incredibly complex disease. The conclusion that rare and infrequent variants have little effect on risk under this study design and in these populations is important. However, it may be that Fuchsberger et al. have eliminated only the rare variants identifiable from a case–control study. The genetics of type 2 diabetes might still be a nightmare, but nonetheless the search continues.
1. Rewers, M. & Hamman, R. F. in Diabetes in America 2nd edn, Ch. 9, 179–220 (NIH, 1995).
2. Morris, A. P. et al. Nature Genet. 44, 981–990 (2012).
3. Fuchsberger, C. et al. Nature 536, 41–47 (2016).
4. Steinthorsdottir, V. et al. Nature Genet. 46, 294–298 (2014).
5. Cho, Y. S. et al. Nature Genet. 44, 67–72 (2012).
6. Ma, R. C. W. et al. Diabetologia 56, 1291–1305 (2013).
7. Neel, J. V. in The Genetics of Diabetes Mellitus (eds Creutzfeldt, W., Köbberling, J. & Neel, J. V.) 1–11 (Springer, 1976).
8. Cohen, J. C., Boerwinkle, E., Mosley, T. H. Jr & Hobbs, H. H. N. Engl. J. Med. 354, 1264–1272 (2006).
9. Bergman, R. N. et al. Diabetes 52, 2168–2174 (2003).
10. Zeggini, E., Gloyn, A. L. & Hansen, T. Diabetologia 59, 938–941 (2016).
11. Blangero, J. & Kent, J. W. Jr in Genetic and Molecular Aspects of Sport Performance (eds Bouchard, C. & Hoffman, E. P.) 33–45 (Wiley-Blackwell, 2011).
12. Lange, L. A. et al. Am. J. Hum. Genet. 94, 233–245 (2014).