Analyses of genome-wide association studies (GWAS) show that common SNPs can account for the majority of the heritability of complex traits, but that there are likely to be limits to the usefulness of the current strategy of accumulating common variants of small effect for risk prediction. The ongoing success of GWAS has implications for functional characterization of trait-associated loci.
A quick glance at our table of contents will show that the bumper harvest of common variants associated with disease and with medically relevant traits continues year round and unabated. What, then, are the expected limits to the strategy? How big an investment should be made in ever-larger studies? What might be the balance of expected immediate benefits in the form of risk prediction against those biological insights requiring follow-on studies?
These questions are addressed here in Analyses by Ju-Hyun Park and colleagues (p. 570), and Jian Yang and colleagues (p. 565), as well as in a News and Views by Greg Gibson (p. 558). According to Yang et al., there is no reason to propose that any heritable variance is 'missing'. Consistent effects of epigenetic variants, gene-environment interactions and non-additive effects of combinations of variants can be excluded from consideration, as these factors are taken into account when defining heritable genetic variance. Instead, many more common variants can be anticipated to contribute to each trait, albeit with too small an effect size to reach significance in genome-wide screens. Rare variants of large effect cannot explain very much heritability, but Yang et al. predict that causative variants may well be distributed with lower minor allele frequencies than the GWAS SNPs that tag them. Park et al. point out that fine-mapping studies have so far failed to find common variants with larger effect sizes than their tagging SNPs, and they propose extending their method to predict the yield of rarer variants genome wide in the next set of experiments sequencing across GWAS loci.
Given the finding of Park et al. and others that there are diminishing returns in predicting disease risk from common marker SNPs, it seems sensible to invest a greater effort in a functional investigation of the hundreds of biological hypotheses turned up by GWAS in the last four years and to do so in a way that preserves the high standards developed by the community.
In evaluating articles for review for this journal, we do not currently apply rigid editorial criteria for functional analysis of loci discovered via GWAS because we do not want to stifle innovation and discovery in the name of rigor. Rather, we consider whether integrative approaches preserve or destroy essential information gathered in foregoing experiments (Nat. Genet. 42, 1, 2010). In effect, we are interested to know how well the precautions and standards of the GWAS practitioners propagate into the ensuing functional experiments. These principles lead us to consider follow-on experiments in several tiers, according to the levels of evidence provided.
The first set of functional investigations that follow from a replicated GWAS experiment need only use the replicated SNPs and regions of linkage disequilibrium (LD) independent of annotation or relationship to nearby genes. These investigations include fine mapping, genomic analysis of gene expression in human tissues, investigation of risk SNPs as modifiers of monogenic traits, screens for somatic mutations on marker-containing haplotypes, case-control studies of all variants within the region of LD and epigenetic analysis of human tissues using the GWAS SNPs and their LD regions.
In contrast, there is a second set of functional investigations that only become possible once a gene or genes have been securely implicated. In complex traits, demonstrating causation may be impossible, but there should at least be evidence that the variants identified in a GWAS show a consistent correlation with gene expression or gene regulation that is also highly correlated with the human trait originally assayed. At this point, other systems can be used experimentally to provide functional support: for example, mapping mammalian quantitative trait loci in the syntenic region, investigating deletions of the syntenic region, natural variants and mouse knockouts of nearby genes, fish or cell knockdown of a nearby gene and in vitro expression studies with constructs of nearby genes. A further level of proof may be needed before pathway analysis is attempted because multiple genes across the genome will need to be functionally implicated. The construction of a network of marker SNPs and loci may not be secure, even for hypothesis generation, unless it can be shown that there is a way to validate such a model after a transparent process of multiple hypothesis testing.
We welcome the reassuring message from the modelers that the sky is not falling, heritability is not missing and the models currently being used provide quantitative guidance concerning investment-to-benefit ratios of larger studies with common marker variants. We look forward to similarly quantitative predictions recommending where to focus screening effort once data on rarer variants are available. Finally, we endorse the recommendation of Park et al. that because more complex methods are now being used in association studies, a power calculation must be included in papers to be published in the journal.
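To make the idea of a power calculation concrete, the following is a minimal sketch for the simplest possible case: a single additive SNP tested against a quantitative trait with a 1-degree-of-freedom test. It is not the method of Park et al. or Yang et al. The function names, the genome-wide significance threshold of 5e-8 and the example allele frequency and effect size are all illustrative assumptions, and the more complex designs mentioned above would need a correspondingly more detailed calculation.

```python
from math import sqrt
from statistics import NormalDist


def snp_variance_explained(maf, beta):
    """Fraction of trait variance explained by one additive SNP.

    Assumes Hardy-Weinberg equilibrium and that beta is the per-allele
    effect in units of phenotypic standard deviations (an assumption of
    this sketch, not a value taken from the editorial).
    """
    return 2.0 * maf * (1.0 - maf) * beta ** 2


def gwas_power(n, maf, beta, alpha=5e-8):
    """Approximate power of a two-sided 1-df test at significance level alpha.

    Uses the fact that a 1-df chi-square statistic with non-centrality
    parameter ncp is the square of a normal variable with mean sqrt(ncp).
    """
    q2 = snp_variance_explained(maf, beta)
    if not 0.0 <= q2 < 1.0:
        raise ValueError("variance explained must lie in [0, 1)")
    ncp = n * q2 / (1.0 - q2)           # non-centrality parameter
    z = NormalDist()                    # standard normal
    crit = -z.inv_cdf(alpha / 2.0)      # two-sided critical value
    shift = sqrt(ncp)
    return z.cdf(shift - crit) + z.cdf(-shift - crit)


if __name__ == "__main__":
    # Hypothetical variant: minor allele frequency 0.3, effect 0.03 SD per allele.
    for n in (10_000, 50_000, 100_000, 200_000):
        print(n, round(gwas_power(n, maf=0.3, beta=0.03), 3))
```

Running the loop for increasing sample sizes shows how quickly, or slowly, power rises for a variant of small effect, which is the kind of quantitative guidance on the value of larger studies that the models discussed above aim to provide.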
On beyond GWAS. Nat Genet 42, 551 (2010). https://doi.org/10.1038/ng0710-551