Epigenetic variation affects genome function and hence can contribute to common disease. Establishing a possible link requires systematic studies, such as the proposed epigenome-wide association studies (EWASs).
Of the many epigenetic marks, DNA methylation (DNAm) is the most stable and accessible and therefore ideally suited for EWASs.
In principle, EWASs should be as successful as genome-wide association studies (GWASs) in identifying disease-associated variation. However, there are fundamental differences between GWASs and EWASs that need to be considered for appropriate study design.
The key differences for EWASs are tissue specificity and the possibility that some epigenetic changes may occur downstream of the disease process. Both considerations affect the type of cohorts and samples that should be analyzed.
Technologies for EWASs are readily available on both array- and sequencing-based platforms, but many of the computational and statistical analysis methods remain to be developed.
At this early stage, it is challenging to predict the possible effect of DNAm variation. However, if such an effect exists and the right study design is used, then much more than the 'low-hanging fruit' should be detectable in fewer samples than are required for a typical GWAS, based on simulations assuming a conservative methylation odds ratio.
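The sample-size claim in the last key point can be made concrete with a rough power calculation. The sketch below is an illustration only and is not the simulation used by the authors. It assumes a deliberately simplified model in which each individual is scored as methylated or unmethylated at a single site, the methylation odds ratio relates the case and control frequencies of the methylated state, and power comes from the normal approximation to a two-sided two-proportion test. The function names, the control frequency of 0.3, the odds ratio of 1.5 and the significance threshold of 1e-6 are all hypothetical choices.

```python
import math
from statistics import NormalDist


def case_frequency(control_freq: float, odds_ratio: float) -> float:
    """Frequency of the methylated state in cases implied by an odds ratio.

    Assumption: methylation is treated as a binary state per individual.
    """
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def approx_power(n_per_group: int, control_freq: float,
                 odds_ratio: float, alpha: float) -> float:
    """Approximate power of a two-sided two-proportion z-test.

    Uses the normal approximation with equal numbers of cases and controls.
    """
    p0 = control_freq
    p1 = case_frequency(p0, odds_ratio)
    p_bar = (p0 + p1) / 2.0
    # Standard error under the null (pooled) and under the alternative.
    se_null = math.sqrt(2.0 * p_bar * (1.0 - p_bar) / n_per_group)
    se_alt = math.sqrt((p0 * (1.0 - p0) + p1 * (1.0 - p1)) / n_per_group)
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    delta = abs(p1 - p0)
    upper = nd.cdf((delta - z_crit * se_null) / se_alt)
    lower = nd.cdf((-delta - z_crit * se_null) / se_alt)
    return upper + lower


if __name__ == "__main__":
    # Hypothetical settings: control frequency 0.3, odds ratio 1.5, alpha 1e-6.
    for n in (250, 500, 1000, 2000, 4000):
        print(n, round(approx_power(n, 0.3, 1.5, 1e-6), 3))
```

Real DNAm measurements are quantitative rather than binary, so an actual EWAS power analysis would need a model of continuous methylation levels. The sketch only shows how odds ratio, sample size and significance threshold interact.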
Despite the success of genome-wide association studies (GWASs) in identifying loci associated with common diseases, a substantial proportion of the causality remains unexplained. Recent advances in genomic technologies have placed us in a position to initiate large-scale studies of human disease-associated epigenetic variation, specifically variation in DNA methylation. Such epigenome-wide association studies (EWASs) present novel opportunities but also create new challenges that are not encountered in GWASs. We discuss EWAS design, cohort and sample selections, statistical significance and power, confounding factors and follow-up studies. We also discuss how integration of EWASs with GWASs can help to dissect complex GWAS haplotypes for functional analysis.
S.B. was supported by the Wellcome Trust (084071) and a Royal Society Wolfson Research Merit Award.
The authors declare no competing financial interests.
- Genome-wide association studies
(GWASs). These are genome-wide studies that are designed to identify genetic associations with an observable trait, disease or condition, such as diabetes.
- Exome
The part of a genome that encodes exons for translation into proteins.
- Epigenome
The complete collection of epigenetic marks, such as DNA methylation and histone modifications, and other molecules that can transmit epigenetic information, such as non-coding RNAs, that exist in a cell at any given point in time.
- Core histones
The proteins that form the nucleosome, which is composed of two copies each of the histones H2A, H2B, H3 and H4. Together, they form a histone octamer around which 147 bp of genomic DNA are wrapped.
- Core promoters
Regions upstream and downstream of the transcriptional start site (TSS), typically defined as the interval −60 to +40 bases from the TSS.
- CpG islands
(CGIs). Regions of the genome (typically 500 bp–2 kb) that contain a higher than expected frequency of CpG sites. CGIs are frequently unmethylated and found near promoter regions.
- Imprinted genes
This term refers to genes that are expressed in a parent-of-origin-specific manner.
- Loss of imprinting
(LOI). Parental imprinting results in the epigenetic silencing of one allele of a gene owing to its parental origin. Aberrant disruption of imprinting leads to both alleles being expressed; that is, loss of imprinting.
- Satellite DNA
A type of non-coding, repetitive DNA that is a component of functional centromeres and the main structural constituent of heterochromatin.
- Methylation quantitative trait loci
(methQTLs). DNA variants that influence the DNA methylation state either in cis or in trans.
- Allele-specific methylation
(ASM). The presence of DNA methylation on only one of the two alleles present in a cell. This could be due to parental imprinting, random methylation of one allele or genetic effects.
- Reverse causation
Refers to an association between A and B that is due to B causing A rather than the presumed A causing B.
- Methylation-sensitive restriction enzyme digestion
Procedure that cleaves dsDNA depending on the methylation status of the enzyme's recognition site. Some enzymes only cleave when the recognition site is methylated and others only when the site is unmethylated.
- Affinity enrichment
In this context, this term refers to a procedure to enrich methylated DNA fragments from a pool of methylated and unmethylated fragments using affinity reagents such as antibodies against 5-methylcytosine or other methyl-binding proteins.
- RRBS
(Reduced representation bisulphite sequencing). A procedure for single-base-resolution methylation analysis using bisulphite DNA sequencing of a representative part of a genome, typically 5–10%.
The two main statistical schools are the classical (or frequentist) school, which dominated twentieth century science and measures the strength of evidence against a hypothesis using P values, and the Bayesian school, which was developed in the nineteenth century but is currently undergoing a resurgence and attempts to compute the posterior probability that the hypothesis is true.
- Principal coordinates
Analysis of principal coordinates is a multivariate statistical technique that is related to principal components analysis but investigates individuals rather than variables. It is often used to investigate population structure in a sample of individuals whose relatedness has been estimated from genome-wide genotype data.
- ChIP-seq
(Chromatin immunoprecipitation followed by sequencing). A method for mapping the genome-wide distribution of histone modifications and chromatin-associated proteins that relies on immunoprecipitation with antibodies to modified histones or other chromatin proteins. The enriched DNA is sequenced to create genome-wide profiles.
- Bivalent chromatin
A property of chromatin that contains both activating and repressing epigenetic modifications at the same locus.
- Multivariate hidden Markov analysis
A statistical method for modelling multidimensional data by one of a small number of hidden Markov states, each of which is associated with a multivariate probability distribution.
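The CpG islands entry above defines CGIs by a higher than expected frequency of CpG sites. One common way to quantify this, which the glossary itself does not specify, is the observed-to-expected CpG ratio, (number of CpG × sequence length) / (number of C × number of G), usually reported together with GC content. The sketch below computes both for a sequence string. The function names and the example sequence are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    s = seq.upper()
    if not s:
        return 0.0
    return (s.count("G") + s.count("C")) / len(s)


def cpg_obs_exp(seq: str) -> float:
    """Observed-to-expected CpG ratio: (#CpG * length) / (#C * #G).

    Returns 0.0 when the sequence contains no C or no G, since the
    expected count is then zero and the ratio is undefined.
    """
    s = seq.upper()
    n_c = s.count("C")
    n_g = s.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    n_cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return n_cpg * len(s) / (n_c * n_g)


if __name__ == "__main__":
    example = "TTACGCGGCGCATCGCGTACGA"  # hypothetical CpG-rich fragment
    print(round(gc_content(example), 3), round(cpg_obs_exp(example), 3))
```

In practice these two statistics are evaluated in sliding windows along a chromosome rather than on a single short string.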
Cite this article
Rakyan, V., Down, T., Balding, D. et al. Epigenome-wide association studies for common human diseases. Nat Rev Genet 12, 529–541 (2011). https://doi.org/10.1038/nrg3000