Epigenomic and transcriptomic profiling in Alzheimer’s disease

Conserved epigenomic signals in mice and humans reveal immune basis of Alzheimer’s disease.

Gjoneska, E. et al. Nature 10.1038/nature14252

For transcriptome analysis, we used RNA sequencing to quantify gene expression changes for 13,836 ENSEMBL genes (see Methods; Extended Data Fig. 1a; Supplementary Table S1). We found 2,815 up-regulated genes and 2,310 down-regulated genes in the CK-p25 AD-mouse model as compared to CK littermate controls (at q<0.01; Supplementary Table S1), which we classified into transient (2 weeks only), late-onset (6 weeks only), and consistent (both) (Fig. 1a; Extended Data Fig. 4a; Supplementary Table S1). These showed distinct functional enrichments (Fig. 1a; Supplementary Table S2), with transient-increase genes enriched in cell cycle functions (p<10^-92), consistent-increase genes enriched in immune (p<10^-10) and stimulus response functions (p<10^-4), and consistent- and late-decrease genes enriched in synaptic and learning functions (p<10^-12).
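The three temporal classes described above can be illustrated with a minimal sketch. It assumes each gene has an adjusted p-value (q) and a log fold change at 2 weeks and at 6 weeks; the function name, the argument names and the handling of genes that change direction between time points are hypothetical and are not taken from the paper's Methods.

```python
from typing import Optional

Q_CUTOFF = 0.01  # significance threshold quoted in the text


def classify_gene(q_2wk: float, lfc_2wk: float,
                  q_6wk: float, lfc_6wk: float) -> Optional[str]:
    """Return the temporal class of one gene, or None if it is not classified.

    lfc_* are log fold changes (CK-p25 vs. CK control); q_* are adjusted p-values.
    """
    sig_2wk = q_2wk < Q_CUTOFF
    sig_6wk = q_6wk < Q_CUTOFF
    if not (sig_2wk or sig_6wk):
        return None  # no significant change at either time point
    if sig_2wk and sig_6wk:
        # Hypothetical rule: opposite directions at the two time points
        # are left unclassified rather than forced into a class.
        if (lfc_2wk > 0) != (lfc_6wk > 0):
            return None
        timing, lfc = "consistent", lfc_2wk
    elif sig_2wk:
        timing, lfc = "transient", lfc_2wk   # 2 weeks only
    else:
        timing, lfc = "late", lfc_6wk        # 6 weeks only
    direction = "increase" if lfc > 0 else "decrease"
    return f"{timing}-{direction}"
```

For instance, a gene with q = 0.001 and a positive fold change at 2 weeks but q = 0.5 at 6 weeks would be labelled `transient-increase` under these assumptions.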

These coordinated neuronal and immune changes are consistent with the pathophysiology of AD [2] and likely reflect both cell type-specific expression changes and changes in cell composition. Indeed, comparison with expression in microglia [8] (the resident immune cells of the brain) shows that both cell type composition (p = 2.7 × 10^-4) and microglia-specific activation (p = 2.9 × 10^-6) contribute significantly to the gene expression changes (see Methods). Additionally, quantitative RT-PCR of increased-level genes in purified CD11b+ CD45Low microglia populations confirms cell type-specific activation for 5 of the 7 microglia-specific genes tested (Extended Data Fig. 2).

[...]

We quantified epigenomic changes in promoter regions using relative differences in H3K4me3 levels resulting in 3,667 increased-level and 5,056 decreased-level peaks (q<0.01, Extended Data Fig. 4b; Supplementary Table S3), which we classified into transient, consistent, and late-stage, as for gene expression changes. For enhancer regions, we used relative levels of H3K27ac, resulting in 2,456 increased-level and 2,154 decreased-level peaks (Extended Data Fig. 4c; Supplementary Table S3). Only a very small number of peaks showed differences in Polycomb-repressed and heterochromatin regions, leading us to focus on enhancer and promoter changes for the remaining analyses (Extended Data Fig. 4d,e; Supplementary Table S3).

Genes flanking increased- and decreased-level regulatory regions (see Methods) showed consistent gene expression changes for both promoter and enhancer regions (Extended Data Fig. 5), and were consistently enriched in immune and stimulus-response functions for increased-level enhancers and promoters, and in synapse and learning-associated functions for decreased-level enhancers and promoters (Fig. 1d,e), consistent with our gene ontology results for changing gene expression levels.

Methylome profiling in Alzheimer’s disease

Methylomic profiling implicates cortical deregulation of ANK1 in Alzheimer's disease.

Lunnon, K. et al. Nature Neuroscience 10.1038/nn.3782

We performed a cross-tissue analysis of methylomic variation in AD using samples from four independent human post-mortem brain cohorts.

[…]

For the first (discovery) stage of our analysis, we used multiple tissues from donors (n = 122) archived in the MRC London Brainbank for Neurodegenerative Disease. From each donor, we isolated genomic DNA from four brain regions (entorhinal cortex (EC), n = 104; superior temporal gyrus (STG), n = 113; prefrontal cortex (PFC), n = 110; cerebellum (CER), n = 108) and, where available, from whole blood obtained pre-mortem (n = 57). Our analyses focused on identifying differentially methylated positions (DMPs) associated with Braak staging, a standardized measure of neurofibrillary tangle burden determined at autopsy [12], with all analyses controlling for age and sex.

We first assessed DNA methylation differences identified in the EC, given that it is a primary and early site of neuropathology in AD [5]. Two of the top-ranked EC DMPs (cg11823178, the top-ranked EC DMP, and cg05066959, the fourth-ranked EC DMP) were located 91 bp away from each other in the ankyrin 1 (ANK1) gene on chromosome 8, which encodes a brain-expressed protein [13] involved in compartmentalization of the neuronal plasma membrane [14] (Fig. 1a). These DMPs are also located proximal to the NKX6-3 gene, encoding a homeodomain transcription factor involved in the development of the brain [15, 16]. Increased EC DNA methylation at both CpG sites was associated with Braak stage (cg11823178: r = 0.47, t(102) = 5.39, nominal P = 4.59 × 10^−7; cg05066959: r = 0.41, t(102) = 5.37, nominal P = 1.34 × 10^−5; Fig. 1b). Hypermethylation at both DMPs was significantly associated with Braak score in the STG (cg11823178: r = 0.37, t(111) = 4.15, nominal P = 6.51 × 10^−5; cg05066959: r = 0.33, t(111) = 3.67, nominal P = 3.78 × 10^−4) and the PFC (cg11823178: r = 0.29, t(108) = 3.12, nominal P = 2.33 × 10^−3; cg05066959: r = 0.32, t(108) = 3.52, nominal P = 6.48 × 10^−4) (Fig. 1c). In contrast, no significant neuropathology-associated hypermethylation was detected at either CpG site in the CER (cg11823178: r = 0.01, t(106) = 0.082, nominal P = 0.935; cg05066959: r = −0.08, t(106) = 0.085, nominal P = 0.395) (Fig. 1d), a region largely protected from neurodegeneration in AD, nor was elevated DNA methylation at either site associated with AD diagnosis in whole blood collected pre-mortem (data not shown).
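The number attached to each t statistic above is its degrees of freedom, which here equals n − 2 for the region's sample size (104, 113, 110 and 108 samples give 102, 111, 108 and 106). As a rough illustration of how a correlation coefficient relates to such a t statistic, the sketch below applies the standard conversion t = r·sqrt(df / (1 − r²)). It assumes a simple correlation test and ignores the age and sex covariates, so it is not a reconstruction of the authors' regression model; the function name is hypothetical.

```python
import math
from typing import Tuple


def t_from_r(r: float, n: int) -> Tuple[float, int]:
    """Convert a correlation coefficient into a t statistic with n - 2 degrees of freedom."""
    df = n - 2
    t = r * math.sqrt(df / (1.0 - r * r))
    return t, df


# cg11823178 in the entorhinal cortex: r = 0.47, n = 104 (values from the text).
t, df = t_from_r(0.47, 104)
print(f"t({df}) = {t:.2f}")
```

With the rounded r = 0.47 this gives a value of roughly 5.38 on 102 degrees of freedom, close to the reported 5.39; a gap of that size is what one would expect from rounding r to two decimals.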

Notably, we observed a significant overlap in Braak-associated DMPs across the three cortical regions profiled in the London discovery cohort: 38 (permuted P < 0.005) and 30 (permuted P < 0.005) of the 100 top-ranked EC probes were significantly differentially methylated in the same direction in the STG and PFC, respectively (Supplementary Table 8), with a highly significant correlation of top-ranked Braak-associated DNA methylation scores across these sites (EC versus STG: r = 0.88, P = 6.73 × 10^−14; EC versus PFC: r = 0.83, P = 8.77 × 10^−13). There was, however, a clear distinction between cortical regions and CER, with the top-ranked CER DMPs appearing to be more tissue specific and not differentially methylated in cortical regions (permuted P values for enrichment all > 0.05), although ~15% of the top-ranked cortical DMPs were differentially methylated in CER (permuted P values ≤ 0.01), indicating that these represent relatively pervasive AD-associated changes that are observed across multiple tissues. We subsequently used a meta-analysis method to highlight consistent Braak-associated DNA methylation differences across all three cortical regions in the discovery cohort. The top-ranked cross-cortex DMPs are shown in Table 2 and Supplementary Table 9, and differentially methylated regions (DMRs) identified using comb-p are listed in Supplementary Table 10. Of note, cg11823178 was the most significant cross-cortex DMP (Δ = 3.20, Fisher's P = 3.42 × 10^−11, Brown's P = 1.00 × 10^−6), with cg05066959 again being ranked fourth (Δ = 4.26, Fisher's P = 1.24 × 10^−9, Brown's P = 6.24 × 10^−6; Fig. 1e) and a DMR spanning these probes being associated with neuropathology (Sidak-corrected P = 3.39 × 10^−4) (Supplementary Table 10). Together, these data suggest that cortical DNA hypermethylation at the ANK1 locus is robustly associated with AD-related neuropathology.
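The Fisher's P values quoted above can be illustrated with a short sketch of Fisher's combined probability test, which sums −2·ln(p) over k tests and compares the total to a chi-square distribution with 2k degrees of freedom. The sketch assumes the inputs are simply the three nominal P values reported for cg11823178 in the EC, STG and PFC; the function name is hypothetical, and this is not a reconstruction of the authors' full meta-analysis. Fisher's method treats the tests as independent, which regions from the same donors are not; the Brown's P reported alongside is the usual adjustment for correlated tests and is not implemented here.

```python
import math
from typing import Sequence, Tuple


def fisher_combine(p_values: Sequence[float]) -> Tuple[float, int, float]:
    """Fisher's combined probability test for independent p-values.

    Returns (chi-square statistic, degrees of freedom, combined p-value).
    """
    k = len(p_values)
    stat = -2.0 * sum(math.log(p) for p in p_values)
    df = 2 * k
    # For even df = 2k the chi-square upper tail has a closed form:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = stat / 2.0
    tail = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))
    return stat, df, tail


# Nominal P values for cg11823178 in EC, STG and PFC, as quoted in the text.
stat, df, p = fisher_combine([4.59e-7, 6.51e-5, 2.33e-3])
print(f"chi-square({df}) = {stat:.1f}, combined P = {p:.2e}")
```

Under these assumptions the statistic comes out at about 60.6 on 6 degrees of freedom, with a combined P of approximately 3.4 × 10^−11, in line with the chi-square statistic of 60.6 on 6 degrees of freedom given in the Figure 4 legend.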

A second (replication) cortical data set was generated using DNA isolated from two regions (STG and PFC) obtained from a cohort of brains archived in the Mount Sinai Alzheimer's Disease and Schizophrenia Brain Bank (n = 144), with detailed neuropathology data including Braak staging and amyloid burden (Online Methods) [19]. Notably, Braak-associated DNA methylation scores for the 100 top-ranked cross-cortex DMPs identified in the London discovery cohort (Supplementary Table 9) were strongly correlated with neuropathology-associated differences at the same probes in both cortical regions profiled in the Mount Sinai replication cohort (STG Braak score: r = 0.63, P = 2.66 × 10^−12; PFC Braak score: r = 0.64, P = 6.03 × 10^−13; STG amyloid burden: r = 0.46, P = 1.09 × 10^−6; PFC amyloid burden: r = 0.65, P = 2.87 × 10^−13; Fig. 2a). Furthermore, increased DNA methylation at each of the two ANK1 CpG sites was significantly associated with elevated Braak score (Table 1 and Fig. 2b) in the STG (cg11823178: r = 0.28, t(142) = 3.62, nominal P = 1.63 × 10^−4; cg05066959: r = 0.25, t(142) = 3.29, nominal P = 5.78 × 10^−4) and PFC (cg11823178: r = 0.24, t(140) = 3.14, nominal P = 1.07 × 10^−3; cg05066959: r = 0.21, t(140) = 2.75, nominal P = 4.00 × 10^−3), and also amyloid pathology (Fig. 2c) in the STG (cg11823178: r = 0.21, t(142) = 2.81, nominal P = 4.99 × 10^−4; cg05066959: r = 0.27, t(142) = 3.47, nominal P = 5.65 × 10^−4) and PFC (cg11823178: r = 0.29, t(140) = 3.69, nominal P = 2.35 × 10^−4; cg05066959: r = 0.19, t(140) = 2.56, nominal P = 9.93 × 10^−3).

To further confirm the association between cortical ANK1 hypermethylation and neuropathology, we used bisulfite-pyrosequencing to quantify DNA methylation across an extended region spanning eight CpG sites, including cg11823178 and cg05066959, in DNA extracted from a third independent collection of matched EC, STG and PFC tissue (n = 62) obtained from the Thomas Willis Oxford Brain Collection [20] (Online Methods and Supplementary Table 11a). Average DNA methylation across this region was significantly elevated in all three cortical regions tested (EC, P = 0.0004; STG, P = 0.0008; PFC, P = 0.014) in affected individuals (Supplementary Fig. 1), most notably in the EC, where six of the eight CpG sites assessed were characterized by significant AD-associated hypermethylation (Fig. 2d). A meta-analysis of cg11823178 and cg05066959 across all three independent cohorts confirmed consistent neuropathology-associated hypermethylation in each of the cortical regions assessed (Fig. 2e,f). Further evidence to support our conclusions came from an independent epigenome-wide association study (EWAS) of AD pathology in 708 cortical samples (De Jager et al.) [21]. There was a significant correlation (r = 0.57, P = 1.55 × 10^−9) between the 100 top-ranked DNA methylation changes identified in our cross-cortex analyses and neuropathology-associated differences at the same probes in the study by De Jager et al. (Fig. 2g) [21]. Conversely, neuropathology-associated DNA methylation scores for top-ranked DMPs in De Jager et al. [21] were strongly correlated (r = 0.49, P = 7.8 × 10^−10) with those that we observed using the cross-cortex model for the same probes in our discovery cohort (Supplementary Fig. 2). In particular, De Jager et al. [21] also identified a highly significant association between elevated DNA methylation at cg11823178 and cg05066959 and AD-related neuropathology. Together, these data provide compelling evidence for an association between ANK1 hypermethylation and the neuropathological features of AD, specifically in the cortical regions associated with disease manifestation. Although not previously implicated in dementia, genetic variation in ANK1 is associated with diabetic phenotypes [22, 23, 24], an interesting observation given the established links between type 2 diabetes and AD [25].

We identified evidence for cortex-specific hypermethylation at CpG sites in the ANK1 gene associated with AD neuropathology. Definitively distinguishing cause from effect in epigenetic epidemiology is difficult, especially for disorders such as AD that manifest in inaccessible tissues such as the brain and are not amenable to longitudinal study [9, 10]. However, our observation of highly consistent changes across multiple regions of the cortex in several independent sample cohorts suggests that the identified loci are directly relevant to the pathogenesis of AD. In this regard, the ANK1 DMR reported here, and subsequently confirmed by De Jager et al. [21], represents one of the most robust molecular associations with AD yet identified.

Inter-individual variation in the human brain’s epigenome and its relation to Alzheimer’s disease

Alzheimer's disease: early alterations in brain DNA methylation at ANK1, BIN1, RHBDF2 and other loci.

De Jager, P. L. et al. Nature Neuroscience 10.1038/nn.3786

Our analytic strategy involves three stages, which are illustrated in Figure 1. Stage 1 is our DNA methylation screen for chromosomal regions in which methylation levels correlate with AD pathology; details of the analytic model are presented in the Online Methods section. In Stage 2, we replicate the significantly associated CpGs from Stage 1 in an independent set of subjects. In Stage 3, we attempt to functionally validate the role of the differentially methylated regions that are replicated in Stage 2 using mRNA obtained from AD and non-AD subjects. This strategy accomplishes two goals: (a) it further confirms the role of a given differentially methylated region by showing that a meaningful biological effect (transcriptional change) relates to the disease and (b) it helps to narrow down which of the genes near the differentially methylated CpGs are differentially expressed and may therefore be the target gene(s) in a given region.

In the primary analysis of our cortical methylation profiles (Stage 1), we identified autosomal CpGs whose level of methylation correlates with the burden of neuritic amyloid plaques (NP), a key quantitative measure of Alzheimer’s disease neuropathology. NP burden better captures the state of the brain of a deceased subject since cognitively intact individuals display a range of NP pathology, and some of them meet neuropathologic criteria for a diagnosis of AD [16,17]. Table 1 and Supplementary Table 2 contain the results of the primary genome-wide analysis: 137 CpGs are associated with the burden of NP pathology at p < 1.20 × 10^-7. This threshold of significance accounts for all 415,848 tested CpGs by imposing a Bonferroni correction on a standard p < 0.05. Since the exact number of functionally independent units of methylation in the genome is currently unknown, we have chosen this simple but conservative strategy to account for the testing of multiple hypotheses and correct for the testing of each CpG that was measured. Since the proportion of neurons found in each sample was not related to AD (p=0.08), we did not include this as a term in the primary analysis. However, to focus only on the most conservatively associated CpGs, we performed a secondary analysis that includes the variable that captures the proportion of neurons as well as surrogate variables that capture structure in the methylation data that do not correlate with known confounders and may capture cryptic technical or other artifacts. Of the 137 CpGs discovered in the primary analysis, 71 CpGs remain significant in the more conservative secondary analysis (Table 1, Supplementary Table 2). Some of these 71 CpGs are found in the same chromosomal segment and are highly correlated in their level of methylation. Altogether, the 71 CpGs are found in 60 discrete differentially methylated regions distributed throughout the genome (Figure 1): in 8 of these 60 regions, up to three neighboring CpGs with correlated levels of methylation emerge as significant in our analysis and probably capture the same effect.
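The genome-wide threshold described above follows directly from dividing the nominal significance level by the number of CpGs tested. The short sketch below reproduces that arithmetic using the two numbers given in the text; the function names are hypothetical.

```python
ALPHA = 0.05        # nominal significance level stated in the text
N_TESTS = 415_848   # number of CpGs tested, from the text


def bonferroni_threshold(alpha: float = ALPHA, n_tests: int = N_TESTS) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def passes_bonferroni(p: float, alpha: float = ALPHA, n_tests: int = N_TESTS) -> bool:
    """True if a single CpG's p-value clears the Bonferroni-corrected cutoff."""
    return p < bonferroni_threshold(alpha, n_tests)


print(f"Bonferroni threshold: {bonferroni_threshold():.2e}")  # 1.20e-07
```

Because the correction treats all 415,848 CpGs as independent tests, it is conservative when neighboring CpGs are correlated, which is the trade-off the text acknowledges.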

Individually, any one of the significantly associated CpGs (Table 1, Supplementary Table 2) has a modest effect on the brain’s NP burden: on average, each of the 71 CpGs explains 5.0% (range 3.7-9.7%) of the variance in NP burden. However, this is greater than the proportion of variance explained by genetic variants associated with AD, with the exception of APOE. For example, in our subjects, the well-validated CR1 susceptibility allele explains just 1% of variance in NP burden [18], and all known AD variants and APOE ε4 together account for 13.9% of the variance in NP burden. If we consider all 71 CpGs in one comprehensive model, they explain 28.7% of the variance in NP burden, suggesting that methylation levels of certain genomic regions are correlated and that cortical DNA methylation of a large number of discrete regions is strongly correlated with a key measure of AD neuropathology.

Notably, two of the 71 significantly associated CpGs (Supplementary Table 2) are found in loci that harbor known AD susceptibility alleles: cg22883290 in the BIN1 locus (beta=4.44, p=9.00 × 10^-8) and cg02308560 in the ABCA7 locus (beta=3.62, p=2.45 × 10^-12) [19-22]. cg22883290 is located 5 kb from the 5’ end of the BIN1 gene and 92 kb from the index SNP, rs744373, that best captures the genetic association to AD in this region (Figure 2c) [19]. The susceptibility variant rs744373 is moderately associated with the level of methylation at cg22883290 (p=0.0003). However, the CpG association with AD pathology is not driven by the variant: adjusting for rs744373 does not meaningfully change the effect size of the CpG association to NP burden (model with rs744373 as a covariate: beta=4.37, p=4.91 × 10^-7). Within our dataset of modest size, rs744373 is not associated with AD susceptibility, and we therefore cannot formally test for mediation of the SNP’s association to disease by CpG methylation. In the case of ABCA7, the index SNP (rs3764650) is associated with NP burden [23] but has no association (p=0.07) with the level of methylation at cg02308560, which is 25 kb away. Thus, in both of these regions, SNPs and CpGs appear to have independent effects on AD susceptibility. Overall, risk of AD may therefore be affected by different sources of genomic variation (genetic and epigenetic) that have independent effects on the disease process.

To facilitate the interpretation of our results, we performed a secondary analysis correlating the level of methylation at these 71 CpGs with a post-mortem, neuropathologic diagnosis of AD. Twenty-two of the NP-associated CpGs are also associated with a diagnosis of AD at a genome-wide level of significance (Table 1 and Supplementary Table 2), and all of the CpGs associated with NP burden display at least some evidence of association (p<0.001) with AD. This is not surprising since NP burden is one criterion for a neuropathologic AD diagnosis. We note an interesting polarization in the direction of these associations: 82% of the differentially methylated regions are more methylated in subjects with a diagnosis of AD. As noted above, the increased level of methylation in relation to AD at any one associated probe is modest (Figure 2a and 2b).

Cognitively non-impaired subjects display the same alterations in methylation

Alzheimer's disease: early alterations in brain DNA methylation at ANK1, BIN1, RHBDF2 and other loci.

De Jager, P. L. et al. Nature Neuroscience 10.1038/nn.3786

To begin to explore the question of whether the increased level of DNA methylation in the associated regions is a cause or an effect of the neurodegenerative process of AD, we limited the NP analysis to those subjects who were deemed to be cognitively non-impaired at the time of death (no AD and no mild cognitive impairment). As has been well documented in neuropathological and imaging studies [25,26], a large fraction of non-impaired, older individuals demonstrate accumulation of amyloid pathology that is asymptomatic. Within the subset of non-impaired subjects, the p value for the CpG associations is diminished given the reduced sample size (n=237), but the beta values, which capture the magnitude of the association’s effect, are not significantly different from the beta values calculated from the entire sample collection (Supplementary Table 3). This suggests that the altered DNA methylation that we have identified in our discovery study is an early feature of AD pathology and occurs in the presymptomatic stage of the disease. These DNA methylation changes are therefore not secondary to the later stages of the dementing process. The question of whether altered DNA methylation contributes to the pathologic process or is an early epiphenomenon of the neurodegenerative process remains open.

Integrating our results with known Alzheimer’s disease genes

Alzheimer's disease: early alterations in brain DNA methylation at ANK1, BIN1, RHBDF2 and other loci.

De Jager, P. L. et al. Nature Neuroscience 10.1038/nn.3786

To further evaluate the role of these eight genes in relation to well-validated AD genes, we used the DAPPLE algorithm to evaluate the connectivity of these genes with the network of known AD susceptibility genes. We have previously used this method, which requires co-expression of interacting protein pairs and adjusts for gene size, and we reported the existence of an AD susceptibility network derived from protein:protein interaction data [13]. Here, we use an updated model that includes the latest results from genome-wide association studies and the studies of rare variation. First, we find that the network of susceptibility genes from genome-wide association studies and Mendelian AD genes is significant both in terms of direct connectivity (p=0.0072) and indirect connectivity (proportion of susceptibility genes sharing a common interactor, p=0.037) (Supplementary Figure 6). We then repeated the analysis after adding the eight genes found in the validated differentially methylated regions that also display altered RNA expression in AD. As seen in Figure 4, several of the differentially expressed genes found in the differentially methylated regions (ANK1, DIP2A, RHBDF2, RPL13, SERPINF1 and SERPINF2) connect to the AD susceptibility network derived from genetic studies. The direct (p=0.0072) and indirect (p=0.042) network connectivity remain significant in the iteration of the network analysis that includes the eight genes with altered RNA expression levels.

Effects of fetal growth on DNA methylation

Sexual dimorphism in epigenomic responses of stem cells to extreme fetal growth.

Delahaye, F. et al. Nature Communications 10.1038/ncomms6187

We performed genome-wide DNA methylation profiling on purified CD34+ hematopoietic stem and progenitor cells (HSPCs) from 60 subjects, 20 in each of three groups defined by appropriate or excessively large or small birth weight and ponderal index for gestational age and sex. The HELP-tagging assay was used as a survey technique testing ~1.8 million loci quantitatively at nucleotide resolution and including relatively CG dinucleotide depleted loci. After quality control measures, 993,514 loci were selected for further analyses. Of these, 10,043 loci were defined as candidate differentially-methylated loci using batch-adjusted significance and degree of methylation difference thresholds in comparisons of intrauterine growth restricted (IUGR) and large-for-gestational-age (LGA) infants with the normal birth weight controls. We observed a global relative shift towards DNA hypermethylation in CD34+ HSPCs in both IUGR and LGA subjects when compared with the controls (Fig. 1). While there exists a subset of loci altered in association with both IUGR and LGA, the majority of individual loci are distinctive between these groups (Fig. 1c,d).

Sexual dimorphism associated with the extremes of fetal growth

Sexual dimorphism in epigenomic responses of stem cells to extreme fetal growth.

Delahaye, F. et al. Nature Communications 10.1038/ncomms6187

Sex-specific comparisons for DNA methylation patterns are shown between control and IUGR and LGA subjects (Fig. 2). Both IUGR males and females show a shift in DNA methylation profiles compared to controls, but the number of hypermethylated loci is markedly higher in males compared to females (Fig. 2a). Sex-specific differences are also seen in the comparison of LGA to controls, with LGA females showing an increase in the overall number of candidate differentially-methylated loci compared to males (Fig. 2b). These findings indicate a sexual dimorphism in the epigenetic responses of HSPCs to the extremes of growth conditions in utero.

Targeting of DNA methylation changes to specific genomic contexts

Sexual dimorphism in epigenomic responses of stem cells to extreme fetal growth.

Delahaye, F. et al. Nature Communications 10.1038/ncomms6187

While the consequences of DNA methylation changes at recognized promoter sequences are generally predictable, a genome-wide study of this type can generate a majority of findings in un-annotated genomic locations. To predict the functional consequences of these candidate differentially-methylated loci, we took advantage of the mapping of chromatin components in CD34+ HSPCs performed as part of the Roadmap Epigenomics Program. The details of this annotation are described in a separate report (Wijetunga, Delahaye et al., manuscript in review), but involved the use of the Segway algorithm to generate genomic features that were then interpreted using Self-Organizing Maps. We were thus able to define candidate promoters, enhancers, transcribed sequences and repressive chromatin in the epigenome specific to the CD34+ HSPC population. Every HpaII site could then be assigned to a candidate feature based on its genomic position. The HELP-tagging assay represents each of the candidate genomic features (based on 993,514 loci), and the candidate differentially-methylated loci (10,043) are significantly enriched in Segway features 4 (enhancers, p<0.001) and 6 (promoters, p<0.001), indicating preferential targeting to transcriptional regulatory elements (Fig. 3a). We show an example of the mapping of one of the candidate differentially-methylated loci to the promoter of the Retinoid X receptor, alpha (RXRA) gene, at an annotated CpG island and within the Segway feature 6 annotation indicating candidate promoter function. The HELP-tagging derived Methylation Scores for Cases (IUGR and LGA combined) are compared with controls to demonstrate the magnitude of the change at this locus (Fig. 3b).

Age- and expression-related methylation linked to antigen processing and presentation genes

Age-related variations in the methylome associated with gene expression in human monocytes and T cells.

Reynolds, L. M. et al. Nature Communications 10.1038/ncomms6366

We consider age-eMS (age- and expression-associated methylation sites) overlapping enhancers, insulators, or promoters (669 age-eMS) as top candidates for potentially functional age-dMS (Supplementary Data 2). These 669 CpG sites overlapping potentially functional regulatory regions are associated with the expression of 403 different genes, which are significantly enriched [26] with antigen processing and presentation genes (GO:0019882, FDR = 9.60 × 10^-4). Table 1 shows associations between age, methylation, and expression of 13 antigen processing and presentation genes including major histocompatibility complex (MHC) class I and II genes. Supporting previous findings [27], we observe an up-regulation of all MHC class I and II genes with expression associated with age (FDR < 0.01) (Supplementary Table 2).

Age- and expression-related methylation linked to vascular ageing

Age-related variations in the methylome associated with gene expression in human monocytes and T cells.

Reynolds, L. M. et al. Nature Communications 10.1038/ncomms6366

To further explore the biological relevance of age-eMS, we identified a subset of 186 age-eMS that were associated with vascular age (FDR<0.001), measured by pulse pressure (Supplementary Data 2). After adjusting for chronological age, 42 age-eMS remained nominally associated with pulse pressure (pulse pressure-eMS), some of which were associated with expression of genes that have biologically plausible roles in vascular aging, such as ARID5B (AT rich interactive domain 5B (MRF1-like)) and GSN (gelsolin). ARID5B is a transcription factor that has been implicated in the pathogenesis of coronary artery disease [28]. Gelsolin is an actin binding protein which has been linked to vascular permeability [29], cell motility, and the development of many pathological processes, including cardiovascular diseases [30].

T-cell-specific enhancers

Epigenomic analysis of primary human T cells reveals enhancers associated with TH2 memory cell differentiation and asthma susceptibility.

Seumois, G. et al. Nature Immunology 10.1038/ni.2937

A characteristic feature of asthma is the aberrant accumulation, differentiation or function of memory CD4+ T cells that produce type 2 cytokines (TH2 cells). By mapping genome-wide histone modification profiles for subsets of T cells isolated from peripheral blood of healthy and asthmatic individuals, we identified enhancers with known and potential roles in the normal differentiation of human TH1 cells and TH2 cells (Figure 2). We discovered disease-specific enhancers in T cells that differ between healthy and asthmatic individuals (Figure 7). Enhancers that gained the histone H3 Lys4 dimethyl (H3K4me2) mark during TH2 cell development showed the highest enrichment for asthma-associated single nucleotide polymorphisms (SNPs), which supported a pathogenic role for TH2 cells in asthma (Figure 6). In silico analysis of cell-specific enhancers revealed transcription factors, microRNAs and genes potentially linked to human TH2 cell differentiation. Our results establish the feasibility and utility of enhancer profiling in well-defined populations of specialized cell types involved in disease pathogenesis.

Figure 1: Differential gene expression and histone mark levels at regulatory regions in CK-p25 mice.

Shown are 6 distinct classes of differentially modified regions: transient (early) increase (pink) or decrease (light blue), consistent increase (red) or decrease (blue), and late (6 wk) increase (dark red) or decrease (navy blue). The heatmap shows the log fold change relative to 2 week controls for a, gene expression; b, H3K4me3 peaks at “TSS” chromatin state; c, H3K27ac peaks at enhancer chromatin state; d, H3K27me3 peaks overlapping the Polycomb repressed chromatin state; e, H3K9me3 peaks overlapping the heterochromatin chromatin state. Numbers denote peaks falling into each category.

Figure 2: Differential microglia-specific gene expression changes in the CK-p25 mice.

Quantitative RT-PCR of selected microglia markers and immune response genes shows upregulation of gene expression in FAC-sorted CD11b+ CD45Low microglia from 2 week induced CK-p25 mice (red bars) relative to respective controls (black bars). ACTB (β-actin) was used as a negative control. Values were normalized to CD11b expression (n=3, two-tailed t-test, *p<0.05); ns, non-significant.

Figure 3: Relationship between changes in gene expression and regulatory regions in CK-p25 mice.

For each class of gene expression change in the CK-p25 model (x axis), enrichment for overlap with different histone modifications is shown (y axis) for a, H3K4me3 at promoters; b, H3K27ac at enhancers; c, H3K27me3 at Polycomb repressed regions. Histone modifications were mapped to the nearest transcription start site (Supplementary Table S3) to show the enrichment of the changing regulatory regions relative to those that are stable in CK-p25. The significance is calculated based on the hypergeometric p-value of the overlap.
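As an illustration of a hypergeometric overlap test of the kind mentioned in this legend, the sketch below computes the upper-tail probability of seeing at least a given overlap between two gene sets drawn from a common universe. The function name and the toy counts in the usage note are hypothetical and are not taken from the paper; the exact background set used by the authors is not stated here.

```python
from math import comb


def hypergeom_overlap_p(universe: int, set_a: int, set_b: int, overlap: int) -> float:
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    X is the number of 'marked' items (set_a of them exist in the universe)
    found when set_b items are drawn without replacement.
    """
    total = comb(universe, set_b)
    upper = min(set_a, set_b)
    tail = sum(comb(set_a, k) * comb(universe - set_a, set_b - k)
               for k in range(overlap, upper + 1))
    return tail / total


# Toy counts (hypothetical): 20 genes in total, 5 with a changing promoter mark,
# 5 with changing expression, and 3 genes in both sets.
p_toy = hypergeom_overlap_p(universe=20, set_a=5, set_b=5, overlap=3)
```

With these toy counts the probability equals 1126/15504, or about 0.073; with genome-scale counts the same function applies unchanged because Python integers do not overflow.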

Figure 4: Cortex-specific hypermethylation of ANK1 is correlated with AD-associated neuropathology in the brain.

(a) Linear regression models demonstrated that cg11823178 in ANK1 was the top-ranked neuropathology-associated DMP in the EC in the London discovery cohort (n = 104). The adjacent probe, cg05066959, was also significantly associated with neuropathology. Green bars (bottom row) denote the location of annotated CpG islands. (b) EC DNA methylation at both CpG sites was strongly associated with Braak score (cg11823178: r = 0.47, t(102) = 5.39, P = 4.59 × 10^−7; cg05066959: r = 0.41, t(102) = 5.37, P = 1.34 × 10^−5). (c) Both probes were also associated with neuropathology in the other cortical regions assessed in the same individuals, being significantly correlated with Braak score in the STG (n = 113) (cg11823178: r = 0.37, t(111) = 4.15, P = 6.51 × 10^−5; cg05066959: r = 0.33, t(111) = 3.67, P = 3.78 × 10^−4) and the PFC (n = 110) (cg11823178: r = 0.29, t(108) = 3.12, P = 2.33 × 10^−3; cg05066959: r = 0.32, t(108) = 3.52, P = 6.48 × 10^−4). (d) There was no association between DNA methylation and Braak score at either ANK1 probe in the CER (n = 108) (cg11823178: r = 0.01, t(106) = 0.082, P = 0.935; cg05066959: r = −0.08, t(106) = 0.085, P = 0.395), a region largely protected against AD-related neuropathology. (e) cg11823178 was the top-ranked cross-cortex DMP (Fisher's χ²(6) = 60.6, P = 3.42 × 10^−11), with cg05066959 also being strongly associated with Braak score (Fisher's χ²(6) = 52.9, P = 1.24 × 10^−9).

Figure 5: Neuropathology-associated DMPs are consistent across sample cohorts, with replicated evidence for ANK1 hypermethylation.

(a) Braak-associated DNA methylation scores for the top-ranked cross-cortex DMPs identified using linear regression models in the London discovery cohort (Supplementary Table 9) were significantly correlated with neuropathology-associated differences at the same probes in both cortical regions profiled in the Mount Sinai replication cohort using linear regression models (PFC (n = 142) Braak score: r = 0.64, P = 6.03 × 10−13; STG (n = 144) Braak score: r = 0.63, P = 2.66 × 10−12; PFC amyloid burden: r = 0.65, P = 2.87 × 10−13; STG amyloid burden: r = 0.46, P = 1.09 × 10−6). Shown are data for the Mount Sinai PFC Braak score analysis, with the two ANK1 probes (cg11823178 and cg05066959) highlighted in red. (b,c) cg11823178 and cg05066959 were significantly associated with Braak score in the STG (cg11823178: r = 0.28, t142 = 3.62, P = 1.63 × 10−4; cg05066959: r = 0.25, t142 = 3.29, P = 5.78 × 10−4) and PFC (cg11823178: r = 0.24, t140 = 3.14, P = 1.07 × 10−3; cg05066959: r = 0.21, t140 = 2.75, P = 4.00 × 10−3) (b), and amyloid pathology in the STG (cg11823178: r = 0.21, t142 = 2.81, P = 4.99 × 10−4; cg05066959: r = 0.27, t142 = 3.47, P = 5.65 × 10−4) and PFC (cg11823178: r = 0.29, t140 = 3.69, P = 2.35 × 10−4; cg05066959: r = 0.19, t140 = 2.56, P = 9.93 × 10−3) (c). (d) In the Oxford replication cohort, bisulfite-pyrosequencing was used to quantify DNA methylation across eight CpG sites spanning an extended ANK1 region. Linear models, adjusting for age and gender, confirmed significant neuropathology-associated hypermethylation in all three of the cortical regions assessed (Supplementary Fig. 1), most notably in the EC (n = 51), where six of the eight CpG sites showed a significant (amplicon average P = 0.0004) neuropathology-associated increase in DNA methylation (data are represented as mean ± s.e.m.; *P < 0.05, **P < 0.01, ***P < 0.005).
(e,f) Meta-analyses across the three sample cohorts (London, Mount Sinai and Oxford) confirmed Braak-associated cortex-specific hypermethylation for both cg11823178 (e) and cg05066959 (f). Finally, there was a marked consistency in neuropathology-associated DMPs identified in our discovery cohort and those identified in De Jager et al.21. (g) Braak-associated DNA methylation scores for the 100 top-ranked cross-cortex DMPs identified in the London discovery cohort were significantly correlated with neuropathology-associated differences (neuritic-plaque load) at the same probes in the dorsolateral prefrontal cortex (DLPFC) identified by De Jager et al.21 in 708 individuals (r = 0.57, P = 1.55 × 10−9). The two ANK1 probes (cg11823178 and cg05066959) are highlighted in red.

Figure 6: Summary of the genome-wide brain DNA methylation scan for NP burden and its validation using independent DNA methylation data and brain RNA data

Each sector of this diagram presents summary results of the three different analyses in a chromosome. The perimeter of this circular figure presents the physical position along each chromosome (in Mb). The cytogenetic bands of each chromosome are presented in the first circle, with the centromere highlighted in red. The next circle (green) reports the density of CpG probes successfully sampled by the Illumina beadset that are present in a given genomic segment (range, 0–200 probes per 100 kb). The blue circle reports the results of the DNA methylation scan: using a −log(P) scale (range, 0–20), we report the results for each of the 71 associated CpGs found in 60 independent differentially methylated regions (DMR) from the analysis relating DNA methylation levels to NP burden. Similarly, the first red circle reports the −log(P) (range, 0–10) for the 71 CpGs in the replication analysis. The cream-colored circle reports the names of genes found within 50 kb of each associated CpG (light blue letters). The ABCA7 and BIN1 regions, which harbor AD susceptibility alleles, are highlighted in red letters. The subset of the genes with differential mRNA expression in AD in the Mayo clinic data set is shown in black. The next red circle reports the results of the association of RNA expression level of these genes to a diagnosis of AD in the Mayo clinic data set (–log(P); range, 0–20). The central circle reports the set of validated CpGs. Chromosomes 9 and 18 contain no CpG that meets a threshold of genome-wide significance; thus, to enhance the clarity of the figure for the other chromosomes, these two chromosomes are not included in the figure. Not all genes found in the associated regions are listed in the figure. For clarity, only a subset of genes is selected from loci significant in the discovery analysis.

Figure 7: Extent of differences in methylation levels at associated CpGs and regional distribution of associations

(a,b) Two of the most AD-associated probes, from the MCF2L (a) and ANK1 (b) DMRs (Table 1), were selected to illustrate the increase in methylation levels that we observed, on average, with a diagnosis of AD in 82% of the CpGs that met our threshold of significance. Shown is a smoothed histogram presenting the distribution of DNA methylation values at that CpG for subjects that were classified as having a neuropathologic diagnosis of AD (case, red, n = 460) and those subjects that did not meet these diagnostic criteria (control, light green, n = 263). A methylation value of 1.0 indicates that the CpG is completely methylated in these samples. We found that the distribution of AD subjects was statistically significantly different from that of the control subjects. However, the two distributions overlapped, and the absolute difference between the two distributions was modest. (c) Regional association plot around cg22883290 in the BIN1 DMR that has previously been associated with AD susceptibility in genome-wide association studies. Each diamond represents one CpG tested in this region. The horizontal dotted blue line highlights the threshold of significance for this analysis. The vertical blue line reports the density of CpG probes at a given point. The extent to which DNA methylation level at a given CpG correlates with the level of DNA methylation of the best CpG (cg22883290) is reported using the |r|2 value. Finally, above the diagram of the genes found in this DNA segment, the chromatin state of the region is shown, as assessed in healthy, unimpaired older individuals with minimal AD-related pathology. The chromatin state was derived in 200-bp bins. Overall, the BIN1 gene appeared to be in an open, transcribed conformation in healthy, older dorsolateral prefrontal cortex, and the associated CpG appeared to be located in a region just 3′ to the gene, which was largely in a conformation found on the periphery of actively transcribed regions. 
(d) Regional association plot around the RHBDF2 DMR, centered on cg13076843, which met our threshold of significance. An associated CpG was found in close proximity to two genes, and our RNA analyses suggest that it is RHBDF2 that is the target of the DMR, as its expression was altered relative to AD (Table 2).

Figure 8: Genome-wide DNA methylation profiles

(a) Density plots of methylation scores for IUGR or LGA compared with controls. The distributions of DNA methylation scores are shown in red. (b) A self-organizing heatmap of candidate differentially methylated loci showing clustering by sample. (c) Volcano plots of DNA methylation score differences for IUGR compared with control, LGA compared with control and IUGR compared with LGA, based on 993,514 loci throughout the genome. Differentially methylated loci with P value <0.05 and absolute methylation difference >20 are shown in black. (d) Differentially methylated loci meeting threshold criteria are quantified in a proportional Venn diagram for each comparison.
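
The selection rule used in panel c (P value below 0.05 and a methylation score difference larger than 20 in absolute value) can be sketched as follows. The function name, the tuple layout, and the example loci are hypothetical; only the two cutoffs come from the legend.

```python
from math import log10

def volcano_points(loci, p_cutoff=0.05, diff_cutoff=20.0):
    """Turn (locus_id, methylation_difference, p_value) records into
    volcano-plot coordinates and flag loci passing both thresholds.

    Returns a list of (locus_id, x, y, is_candidate), where x is the
    methylation score difference and y is -log10(p_value).
    """
    points = []
    for locus_id, diff, p in loci:
        is_candidate = p < p_cutoff and abs(diff) > diff_cutoff
        points.append((locus_id, diff, -log10(p), is_candidate))
    return points

# Made-up loci for illustration only
example_loci = [
    ("locus_A", 35.0, 0.001),   # large difference, small P: candidate
    ("locus_B", -28.0, 0.01),   # large negative difference: candidate
    ("locus_C", 5.0, 0.0001),   # significant but small difference
    ("locus_D", 40.0, 0.2),     # large difference but not significant
]
candidates = [pt[0] for pt in volcano_points(example_loci) if pt[3]]
```

Requiring both conditions is what separates the highlighted loci from those that are significant but small in effect, or large in effect but not significant.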

Figure 9: Genes identified in our DNA methylation screen connect to a network of known AD susceptibility genes.

Using protein-protein interaction data, the DAPPLE algorithm evaluated the extent of connectivity among known AD genes (susceptibility and Mendelian genes) and the seven genes found in DMRs that were also differentially expressed relative to AD. Shown are the results of an analysis allowing for one common interactor protein that is not known to be associated with AD. For example, RHBDF2 is displayed at the top of the figure in green and connects to PTK2B, a protein tyrosine kinase genetically associated with AD susceptibility that has a central role in this network. Notably, SERPINF1 and SERPINF2 connect to different elements of the amyloid component of the network (bottom left). Furthermore, DIP2A connects the recently described PLD3 gene that has a rare AD susceptibility allele and to SORL1, a gene with a common AD susceptibility allele, that connects to the amyloid precursor protein (APP). These interconnections are consistent with the reported effects of both PLD3 and SORL1 on amyloid biology and implicate DIP2A in the same process (see also Supplementary Fig. 6). The colored nodes are the proteins encoded by genes implicated in AD (genetic and epigenomic associations); the colors have no meaning. The connecting proteins not known to be associated with AD are shown in gray.

Figure 10: Sexual dimorphism in IUGR males and LGA females for differentially methylated loci.

The lower panels show volcano plots of DNA methylation score differences; the upper panels quantify the densities of differentially methylated loci (P value <0.05 using analysis of variance with pairwise two-tailed Tukey tests, absolute methylation difference >20). (a) IUGR compared with controls. (b) LGA compared with controls.

Figure 11: Candidate differentially methylated loci are enriched at cis-regulatory elements

(a) Based on empirical annotation of promoter, enhancer, repressive and transcribed regions, enrichment of candidate differentially methylated loci (n=10,043) in cases (IUGR and LGA) compared with controls is illustrated with significance values shown for enriched sequence features. The bar on the left represents the proportional representation of each feature in terms of loci tested by HELP-tagging, whereas the bar on the right shows the proportions of features at which differentially methylated loci are found. Significant enrichment for differential methylation at candidate promoters and enhancers is observed. (b) An example of the RXRA gene with a candidate differentially methylated locus is shown. The DNA methylation score differences between controls and IUGR (top), LGA (middle) and cases (bottom, IUGR and LGA combined) are depicted, with a site identified as being a candidate differentially methylated locus in the CpG island promoter region shown in grey. Blue, positive values represent decreased DNA methylation in the cases of extreme fetal growth; yellow, negative values represent increased methylation.

Table 1 Antigen processing and presentation genes enriched among genes with expression linked to potentially functional age-eMS
Figure 12

Age and cis-gene expression associated methylation sites (age-eMS) detected in 1,264 monocyte samples (FDR<0.001)

Figure 13: Changes in enhancer strength among TH cell subsets.
figure 13

(a) 'Minus-average' (MA) plots for genomic regions with differences in H3K4me2 enrichment (DERs) for pairwise comparisons of indicated cell types; total numbers of DERs identified are listed (left; red and orange dots indicate windows with an adjusted P < 0.05 and raw P < 0.005, respectively; exact test for negative binomial distribution, using edgeR integrated in Bioconductor package MEDIPS). Overlap among the DERs identified for each pairwise comparison (right; Supplementary Table 3). (b) Z-scores of normalized read counts for each unique enhancer DER (columns) obtained from any of the three pairwise comparisons of naive, TH2 and TH1 cells; data are shown from each independent ChIP-seq assay (n = 120 total assays) (rows). (c) H3K4me2 enrichment tracks for each cell type were merged from all assays and illustrated along with location of mouse DNase I hypersensitivity sites (HS) and locus control regions (LCR) (red arrows), IL13 and IL4 promoter (IL13p and IL4p; blue arrows), University of California Santa Cruz (UCSC) multispecies conservation tracks, human TH2 cell cytokine locus and cell type–specific enhancer DERs. H3K4me2 enrichment values for specific 500-bp windows (red dashed-line boxes) are shown below. Each dot represents data from a single assay; error bars indicate mean ± s.e.m. (d) Tracks similar to those in c, for IFNG, GATA3 and TBX21. Enhancer DERs that overlap evolutionarily conserved and putative human-specific enhancers are highlighted by red and blue dashed-line boxes, respectively.
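
The two quantities plotted in panels a and b can be sketched in a few lines: the M ("minus") and A ("average") coordinates of an MA plot, and per-region Z-scores of normalized read counts across assays. This is an illustrative sketch only and is not the MEDIPS or edgeR implementation. The pseudocount, the use of the sample standard deviation, and all names and numbers are assumptions.

```python
from math import log2, sqrt

def ma_values(count_a, count_b, pseudocount=1.0):
    """M = log2(a) - log2(b) and A = mean of the two log2 values.

    The pseudocount (an assumption here) avoids log2(0) for empty windows.
    """
    log_a = log2(count_a + pseudocount)
    log_b = log2(count_b + pseudocount)
    return log_a - log_b, (log_a + log_b) / 2.0

def z_scores(values):
    """Z-score each value against the mean and sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

# Made-up normalized counts for one 500-bp window in two cell types
m, a = ma_values(7.0, 1.0)

# Made-up normalized counts for one region across three assays
z = z_scores([1.0, 2.0, 3.0])
```

Plotting M against A shows whether enrichment differences depend on overall signal strength, while Z-scoring each region puts regions with very different read depths on a common scale for the heatmap.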

Figure 14: Identification of asthma-associated enhancers.

(a) MA plots (vertically displayed) illustrate genomic regions with differences in H3K4me2 enrichment (DERs) between healthy and asthmatic subjects in the three different cell types (Supplementary Table 12). Red dots and orange dots indicate windows with adjusted P < 0.05, or with raw P < 0.005, respectively (exact test for negative binomial distribution, using edgeR integrated in Bioconductor package MEDIPS). Z-scores (right) of normalized read counts for each asthma-associated DER (rows) identified in the TH2 cells. (b) Manhattan plot illustrates the genome-wide distribution of asthma-associated DERs in relation to their statistical significance values (P values, MEDIPS; y-axis parameter). Red dashed line sets the threshold for an adjusted P < 0.05. (c) Comparison of H3K4me2 enrichment between healthy and asthmatic subjects in indicated cells. H3K4me2 tracks for each cell type were merged from all assays performed in healthy (HC) and asthmatic (AS) donors (same cohort as shown for the analysis above and in Fig. 2). (d) H3K4me2 enrichment values for each asthma-associated DER (highlighted in purple dashed-line boxes in c), from the same H3K4me2 ChIP-seq assays shown in Fig. 2. Each dot represents data from an independent assay; n = 18 assays from 10 healthy subjects (HC), n = 24 assays from 12 asthmatic patients (AS); error bars indicate mean ± s.e.m.; *P < 0.05, **P < 0.01, ***P < 0.001 (MEDIPS).

Figure 15: Asthma GWAS SNPs are enriched in TH2 cell enhancers.

(a) Enrichment values of asthma GWAS SNPs in TH cell enhancer subgroups (Fig. 2b) and other cell tissue–specific enhancers32 (top) and for SNPs associated with other diseases (bottom; Supplementary Table 11). Enrichment values that did not reach significance (Chi-squared test, Online Methods) are shown in gray. ADMSC, adipose-derived mesenchymal stem cells. (b) Overlap of cell-specific DERs (shown in Fig. 2b) with asthma GWAS SNPs (top) and percentages of overlapping DERs or asthma SNPs in different DER subgroups (bottom). (c) UCSC tracks of IL33–IL18R, IL5–RAD50–IL13–IL4 and RORA loci containing large haplotype blocks of asthma-associated SNPs (black lines indicate their genomic location, red lines are SNPs that overlap DERs), along with cell-specific DERs tracks and H3K4me2 tracks for each cell type (merged from all assays shown in Fig. 2). Graphs show H3K4me2 enrichment values for each asthma-SNP-associated DER (500-bp regions harboring the asthma SNP; highlighted in purple dashed-line boxes in c) in TH2 cells from the same H3K4me2 ChIP-seq assays shown in Figure 2. Each dot represents data from an independent assay; n = 18 assays from 10 healthy (HC) subjects, n = 24 assays from 12 asthmatic (AS) subjects; error bars indicate mean ± s.e.m.; *raw P < 0.05; **raw P < 0.01; NS, nonsignificant, calculated using MEDIPS.