
Methylation QTLs in the developing brain and their enrichment in schizophrenia risk loci

Nature Neuroscience volume 19, pages 48–54 (2016)

Abstract

We characterized DNA methylation quantitative trait loci (mQTLs) in a large collection (n = 166) of human fetal brain samples spanning 56–166 d post-conception, identifying >16,000 fetal brain mQTLs. Fetal brain mQTLs were primarily cis-acting, enriched in regulatory chromatin domains and transcription factor binding sites, and showed substantial overlap with genetic variants that were also associated with gene expression in the brain. Using tissue from three distinct regions of the adult brain (prefrontal cortex, striatum and cerebellum), we found that most fetal brain mQTLs were developmentally stable, although a subset was characterized by fetal-specific effects. Fetal brain mQTLs were enriched amongst risk loci identified in a recent large-scale genome-wide association study (GWAS) of schizophrenia, a severe psychiatric disorder with a hypothesized neurodevelopmental component. Finally, we found that mQTLs can be used to refine GWAS loci through the identification of discrete sites of variable fetal brain methylation associated with schizophrenia risk variants.

Main

Human brain development is orchestrated by complex transcriptional programs1, which are guided and reinforced by epigenetic modifications to DNA and histone proteins. DNA methylation is the most extensively studied epigenetic modification, having a key role in many important genomic regulatory processes2. Of note, the establishment and maintenance of cell- and tissue-specific DNA methylation patterns is crucial for normal mammalian development3. Although traditionally regarded as a mechanism of transcriptional repression, DNA methylation can be associated with both increases and decreases in gene expression4, and has recently been implicated in other genomic functions, including alternative splicing and promoter usage5.

We recently characterized widespread changes in DNA methylation across human fetal brain development6, although the factors influencing inter-individual methylomic variation during the prenatal period are unknown. Studies in a variety of tissues, including adult human brain7,8, have shown that DNA methylation can be influenced by DNA sequence variation. These mQTLs have been found to overlap with DNA variants associated with levels of gene expression (expression quantitative trait loci, eQTLs)4,9, and may serve as markers for these as well as other genetic influences on gene regulation. Although mQTLs have been assessed in the adult human brain using low-resolution DNA methylation arrays7,8, mQTLs in the developing human brain have not been explored.

We combined high-density DNA methylation profiling with genome-wide SNP genotyping in a large (n = 166) collection of human brain samples from the first and second trimester of gestation. Given the growing evidence that many common variants associated with complex diseases act through effects on gene regulation10,11, we subsequently tested for enrichment of fetal mQTL among risk loci identified in a recent large-scale GWAS of schizophrenia12, a neuropsychiatric disorder with a hypothesized neurodevelopmental component13,14. Finally, we found that mQTL data can be used to refine broad GWAS loci through the identification of discrete sites of variable fetal brain methylation associated with schizophrenia risk variants. As a resource to the wider community, we have developed a searchable online database of fetal brain mQTLs that can be accessed at http://epigenetics.essex.ac.uk/mQTL/.

Results

mQTLs in the developing human brain are widespread and predominantly characterized by cis effects

We performed genome-wide single-nucleotide polymorphism (SNP) genotyping and DNA methylation profiling in 166 human fetal brain samples ranging from 56–166 d post-conception (Online Methods and Supplementary Table 1). After stringent quality control, we tested for an additive effect of allele dosage on DNA methylation across all potential pairings of 430,304 SNPs and 314,554 DNA methylation sites to identify fetal brain mQTLs (Supplementary Table 1). We identified 16,809 mQTLs at a conservative Bonferroni-corrected significance threshold of P < 3.69 × 10−13 (Supplementary Tables 2 and 3). The median DNA methylation change per allele across all identified mQTLs was 6.69% (interquartile range (IQR) = 3.17–8.96%) for each mQTL SNP (Supplementary Fig. 1), slightly larger than reported in a previous analysis of genome-wide mQTLs in the adult brain (median effect size = 4.11%, IQR = 2.13–6.97%)7. The majority of mQTL SNPs (74.17%) are associated with DNA methylation at only a single probe; in contrast, most DNA methylation sites (69.83%) showing evidence for association do so with multiple mQTL SNPs, presumably as a result of linkage disequilibrium (LD) between SNPs (Supplementary Fig. 2). A searchable database of fetal brain mQTLs is available at http://epigenetics.essex.ac.uk/mQTL/.
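The Bonferroni threshold quoted above can be reproduced by dividing a family-wise error rate by the number of SNP-probe pairs tested. The following is a minimal sketch of that arithmetic; the family-wise alpha of 0.05 is an assumption (it is not stated explicitly in the text), and the variable names are illustrative.

```python
# Bonferroni threshold for the genome-wide mQTL scan.
# ALPHA = 0.05 is an assumed family-wise error rate; the SNP and
# probe counts are the ones reported in the text.
ALPHA = 0.05
N_SNPS = 430_304
N_PROBES = 314_554

# Every SNP is paired with every DNA methylation site.
n_tests = N_SNPS * N_PROBES
threshold = ALPHA / n_tests

print(f"SNP-probe pairs tested: {n_tests:,}")
print(f"Bonferroni threshold:   {threshold:.2e}")
```

Under this assumption the result agrees with the reported threshold of P < 3.69 × 10−13.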

The majority of fetal brain mQTLs (96.3%) involve SNPs and DNA methylation sites on the same chromosome (Fig. 1a and Supplementary Table 2). We defined significant SNP-methylation relationships spanning <500 kb as cis-mQTLs (n = 15,942, 94.8% of total), with those spanning >500 kb or being characterized by inter-chromosomal effects (n = 867, 5.16% of total) as representing trans-mQTLs. The strong enrichment of cis-mQTLs concurred with data from other tissues and cell types7,15,16,17,18. Among the cis-mQTLs, both effect size (that is, DNA methylation change per allele; Fig. 1b) and significance (that is, P value; Supplementary Fig. 3) were related to the distance between the Illumina 450K array probe and mQTL SNP.
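The cis/trans definition above can be expressed as a small classification rule. This is an illustrative sketch rather than the authors' code: the function name and the example coordinates are hypothetical, and treating a distance of exactly 500 kb as trans is an assumption, because the text only specifies <500 kb and >500 kb.

```python
CIS_WINDOW_BP = 500_000  # cis/trans boundary given in the text


def classify_mqtl(snp_chrom, snp_pos, probe_chrom, probe_pos):
    """Label one SNP-probe pair as 'cis' or 'trans'.

    Pairs on different chromosomes are always trans. Pairs on the same
    chromosome are cis only when they lie closer than CIS_WINDOW_BP
    (exactly 500 kb is treated as trans here, which is an assumption).
    """
    if snp_chrom != probe_chrom:
        return "trans"
    distance = abs(snp_pos - probe_pos)
    return "cis" if distance < CIS_WINDOW_BP else "trans"


# Hypothetical coordinates, for illustration only.
print(classify_mqtl("10", 104_600_000, "10", 104_650_000))
print(classify_mqtl("10", 104_600_000, "10", 106_000_000))
print(classify_mqtl("10", 104_600_000, "3", 104_650_000))
```

Note that a pair can share a chromosome and still be trans under this rule, which is why the same-chromosome proportion (96.3%) is higher than the cis proportion (94.8%).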

Figure 1: mQTLs in the developing human brain are predominantly cis-acting with effect size related to distance, although notable trans-effects are also present.

(a) The genomic distribution of Bonferroni-significant (P < 3.69 × 10−13) mQTLs in fetal brain, where the position on the x axis indicates the location of Illumina 450K HumanMethylation array probes and the position on the y axis indicates the location of informative SNPs. The color of the point corresponds to the difference in DNA methylation per allele compared with the reference allele, with the largest effects plotted in dark red. A clear positive diagonal can be observed demonstrating that the majority of mQTLs in fetal brain are associated with genotype in cis. (b) The relationship between mQTL effect size (DNA methylation change per allele) and distance between the Illumina 450K HumanMethylation array probe and informative SNPs in fetal brain, confirming the predominantly cis nature of mQTLs. A similar relationship is seen between mQTL significance and distance (Supplementary Fig. 3). (c) Trans-mQTLs in the developing human brain. Shown are all Bonferroni-significant (P < 3.69 × 10−13) trans-mQTLs identified in fetal brain samples. The thickness of each line depicts association effect size, with color reflecting the chromosomal location of the mQTL SNP.

Despite the preponderance of cis-mQTLs, there were some notable trans-mQTL effects (Fig. 1c and Supplementary Table 4), consistent with previous reports of long-range genetic regulation of epigenetic variation in multiple cell types19. Although the average effect size for trans-mQTLs was significantly lower than that observed for cis-mQTLs (two-sided Wilcoxon rank sum test, P = 6.74 × 10−7; Supplementary Fig. 1), there was a higher proportion of larger (DNA methylation change per allele > 25%) effects among trans-mQTLs than cis-mQTLs (1.04% versus 0.715%). Of the 178 DNA methylation sites identified as being associated with trans-acting genetic variation in the fetal brain, 50 (28.09%) and 108 (60.67%) were also identified in studies of pancreatic islet cells16 and lymphocytes19, respectively (Supplementary Tables 4 and 5). These long-range associations between genotype and DNA methylation complement data showing interactions between regulatory elements spanning several Mb20, and even between chromosomes21.

Fetal brain mQTLs are significantly enriched in functional regulatory domains

We used data from ENCODE and the Roadmap Epigenomics Project10,22,23,24 to assess whether sites characterized by genotype-associated DNA methylation colocalize with genomic regions associated with markers of transcriptional activity. We observed an enrichment of fetal brain mQTLs in genomic regions characterized by chromatin immunoprecipitation sequencing (ChIP-seq) peaks for repressive histone modifications in fetal brain, such as H3K9me3 (relative enrichment = 1.43, P = 0.00107) and H3K27me3 (relative enrichment = 1.16, P = 0.000144), and a significant depletion of fetal brain mQTLs in genomic regions defined by ChIP-seq peaks for histone modifications associated with active transcription, such as H3K4me1 (relative enrichment = 0.828, P = 6.55 × 10−8) and H3K36me3 (relative enrichment = 0.543, P = 1.39 × 10−15; Supplementary Table 6). Fetal brain mQTLs were found to be significantly enriched in regions of open chromatin, as indicated by DNase I hypersensitivity sites (DHSs) identified in the adult human brain (relative enrichment = 1.10–1.15, P = 0.00592–6.35 × 10−5; Supplementary Table 7), consistent with the observation that intermediately methylated domains, one potential consequence of allele-specific DNA methylation, are enriched in DHSs25. We also identified a significant enrichment of genotype-associated DNA methylation sites overlapping annotated transcription factor binding sites identified by the ENCODE project10,22 (relative enrichment = 1.26, P = 2.96 × 10−11; Supplementary Table 8). Of note, there was a highly significant enrichment (odds ratio = 1.35, P = 1.66 × 10−9) of fetal brain mQTLs influencing DNA methylation in CCCTC-binding factor (CTCF) motifs (Supplementary Table 8), confirming a finding from a previous study of heritable DNA methylation sites in the human brain26. CTCF is an 11 zinc-finger protein with insulator and chromatin barrier activity whose binding affinity is known to be strongly influenced by DNA methylation27.
Given the important role of CTCF in core genomic processes, including transcription, chromosomal interactions and chromatin structure28, the enrichment of genetically mediated DNA methylation at CTCF binding sites highlights an important potential mechanism linking genetic variation to genomic function. In addition to CTCF, a significant enrichment was also observed in binding sites for several other transcription factors, including IRF1 (relative enrichment = 1.34, P = 1.12 × 10−6), GABP (relative enrichment = 1.32, P = 2.99 × 10−6), ELF1 (relative enrichment = 1.26, P = 6.07 × 10−6), Rad21 (relative enrichment = 1.31, P = 0.000133) and CCNT2 (relative enrichment = 1.25, P = 0.000298), with significant depletion in binding sites for others (for example, SUZ12 (relative enrichment = 0.350, P = 1.88 × 10−9) and CtBP2 (relative enrichment = 0.444, P = 0.000201); Supplementary Table 8).

Although the majority of fetal brain mQTLs are conserved in adult brain regions, there are fetal-specific genetic effects on DNA methylation at certain loci

We next generated mQTL data from three adult human brain regions (prefrontal cortex (PFC), striatum (STR) and cerebellum (CER)) dissected from matched donors (n = 83; 21–96 years old; Online Methods and Supplementary Table 1) to explore the extent to which fetal brain mQTLs are also present in the adult brain. Using a replication mQTL significance threshold of P < 10−5, we found that the majority (83.46%) of fetal brain mQTLs were present in at least one of the tested adult brain regions (Supplementary Table 9 and Fig. 2a) and there was a highly significant overall correlation of individual mQTL effect sizes between fetal brain and each of the individual adult brain regions (PFC: r = 0.911, P < 2.2 × 10−16; STR: r = 0.899, P < 2.2 × 10−16; CER: r = 0.835, P < 2.2 × 10−16; Supplementary Fig. 4) across all Bonferroni-significant fetal brain mQTLs, even in mQTLs that did not meet our replication threshold (Supplementary Fig. 5). Of note, fetal brain mQTLs that did not replicate in adult brain were characterized by significantly lower effect sizes across all brain regions, including the fetal brain discovery sample (P = 3.18 × 10−141; Supplementary Fig. 6). Despite the overall strong concordance in the direction of mQTL effects between fetal and adult brain, there are notable examples of heterogeneity between fetal and adult brain tissue. We used a multilevel linear regression model to test the significance of an interaction term and identify differential mQTL effects across our data sets. Of the 10,663 fetal mQTL effects that we tested, 3,173 (29.76%) were significantly heterogeneous (Bonferroni-corrected, P < 4.69 × 10−6) across the fetal and adult data sets (Supplementary Table 10). These include mQTLs that had notably larger or smaller effects in the adult brain, and fetal-specific mQTLs showing no significant association with DNA methylation in any adult brain region (Fig. 2b).
We also identified a small number (n = 45) of fetal brain mQTLs that had opposite effects on DNA methylation in fetal and adult brain samples (Fig. 2c and Supplementary Table 11).

Figure 2: Despite a highly significant overall correlation of individual mQTL effects between fetal brain and adult brain regions, a subset of loci are characterized by fetal-specific mQTLs.

(a) Heatmap showing effect sizes in adult brain for all fetal brain mQTLs tested. Using a replication mQTL significance threshold of P < 1.00 × 10−5, the majority (83.46%) of fetal brain mQTLs are present in at least one of the tested adult brain regions. Despite the overall strong concordance in the direction of mQTL effects between fetal and adult brain, there are notable examples of heterogeneity in mQTL effects between fetal and adult brain tissue. (b,c) Shown are examples of a fetal-specific mQTL between rs10447470 and cg07900658 (heterogeneity P = 7.23 × 10−40; b) and an mQTL between rs2108854 and cg21577356 (c) showing an opposite direction of effect in fetal brain and cerebellum (heterogeneity P = 8.39 × 10−36).

Fetal brain mQTLs overlap with genetic variants associated with RNA transcript abundance in the brain

We used eQTL data from ten adult brain regions29 to test whether identified fetal brain mQTLs overlap with genetic variants associated with RNA transcript abundance in cis. We compared the distribution of the minimum brain eQTL P values of all interrogated SNPs split into the subsets of those identified as fetal brain mQTLs and those that were not (Supplementary Fig. 7), finding that variants associated with DNA methylation were indeed more likely to be associated with gene expression in cis (Wilcoxon rank-sum test P < 2.2 × 10−16). Of the 414,172 SNPs tested in both the mQTL and eQTL data sets, 9,869 were identified as being Bonferroni-significant cis mQTLs and 2,674 as being Bonferroni-significant (P < 5.99 × 10−9) eQTLs, with an overlap of 750 variants associated with 227 DNA methylation probes and 127 transcript probes (Supplementary Table 12). At a more relaxed eQTL threshold (P < 1.00 × 10−7), there was an overlap of 1,042 variants associated with 315 DNA methylation probes and 183 transcript probes. A list of all variants associated with both DNA methylation and gene expression in cis is given in Supplementary Table 13. This overlapping set of variants likely includes multiple SNPs in LD that are associated with one gene expression transcript and DNA methylation site. Because the extent and magnitude of LD varies across the genome, we established an LD-independent set of SNPs associated with DNA methylation and tested the overlap of these with the sentinelized subsignals, that is, the most associated marker from a set in high LD (r2 > 0.8), from the brain eQTL data set29. Compared with 1,000,000 simulated mQTL SNP sets matched for allele frequency, this overlap was significantly greater than expected (relative enrichment = 4.23, P < 1.00 × 10−6 after 1,000,000 simulations; Supplementary Table 14 and Supplementary Fig. 8).
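The simulation-based enrichment test described above compares an observed overlap count with the overlap counts obtained from matched random SNP sets. The sketch below shows one plausible way to derive a relative enrichment and an empirical P value from such counts. It assumes that relative enrichment is the observed count divided by the mean simulated count and that the empirical P value is the fraction of simulations reaching the observed count; neither estimator is spelled out in this section, and the function names and toy counts are hypothetical.

```python
def relative_enrichment(observed, simulated):
    """Observed overlap divided by the mean overlap across simulated sets
    (assumed definition)."""
    mean_simulated = sum(simulated) / len(simulated)
    return observed / mean_simulated


def empirical_p(observed, simulated):
    """Fraction of simulated sets whose overlap is at least the observed one.

    Returns (p, n_exceeding). When n_exceeding is 0 the estimate is bounded
    by the number of simulations, so it is best reported as P < 1/len(simulated).
    """
    n_exceeding = sum(1 for count in simulated if count >= observed)
    return n_exceeding / len(simulated), n_exceeding


# Toy overlap counts, for illustration only.
simulated_counts = [2, 3, 5, 2, 3]
print(relative_enrichment(6, simulated_counts))
print(empirical_p(6, simulated_counts))
```

With 1,000,000 simulations and no simulated set matching the observed overlap, this convention yields the bound P < 1.00 × 10−6 reported in the text.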

There is a significant enrichment of schizophrenia-associated GWAS variants in fetal brain mQTLs

Our catalog of fetal brain mQTLs provides a unique resource for investigating putative functional consequences of genetic variation associated with postulated neurodevelopmental disorders such as schizophrenia. A recent large-scale GWAS identified 108 independent genomic loci exhibiting genome-wide significant association with the disorder (P < 5 × 10−8), with evidence for a substantial polygenic component in signals that fall below this stringent level of significance12. Because the majority of these variants reside in regions of strong LD and do not index coding variants affecting protein structure, there remains considerable uncertainty about the causal genes involved in pathogenesis and the way in which they are functionally regulated by schizophrenia risk variants. We used PLINK30 to 'clump' our list of significant (P < 3.69 × 10−13) fetal brain mQTL variants into a set of quasi-independent SNPs (SNP pairwise r2 < 0.25 in 250 kb (non-major histocompatibility complex, MHC) or 10,000 kb (MHC); see Online Methods) and tested for enrichment of schizophrenia-associated variants across a range of GWAS significance thresholds, using up to 1,000,000 simulated SNP sets to generate empirical P values (Online Methods). We observed a highly significant enrichment (relative enrichment = 4.11, P = 3.0 × 10−6) of genome-wide significant schizophrenia risk variants amongst fetal brain mQTLs, with a trend for stronger enrichment at more stringent levels of GWAS significance (Supplementary Fig. 9 and Table 1). To examine the specificity of any enrichment, we repeated these analyses using large GWAS data sets from a non-neurodevelopmental brain disorder (Alzheimer's disease, AD31) and two non-neurological phenotypes (body mass index, BMI32, and type 2 diabetes, T2D33). 
Although our confidence in the enrichment of fetal brain mQTLs in these data sets is limited by the smaller number of semi-independent GWAS SNPs, levels of enrichment were found to be notably lower for all other tested phenotypes (Table 1). Variants associated with AD were, however, nominally significantly enriched at the most relaxed GWAS threshold (GWAS threshold P < 5 × 10−5: relative enrichment = 3.18, P = 0.022) and several individual GWAS variants identified for this and the other tested phenotypes were also significant mQTLs in fetal brain (P < 3.69 × 10−13; Supplementary Table 15).
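The clumping step that produces the quasi-independent SNP set can be illustrated with a greedy procedure: take the most significant remaining SNP as an index variant, then discard nearby SNPs that are in LD with it. This is a simplified sketch of that idea, not PLINK's implementation. The data layout (tuples and a dictionary of pairwise r2 values), the function name and the SNP identifiers are all hypothetical.

```python
def clump(snps, r2, r2_max=0.25, window_bp=250_000):
    """Greedy LD clumping sketch.

    snps: list of (snp_id, chrom, pos, p_value) tuples.
    r2:   dict mapping frozenset({id_a, id_b}) to pairwise r2;
          pairs that are absent are treated as r2 = 0.
    Returns the retained index SNP ids, most significant first.
    """
    ordered = sorted(snps, key=lambda s: s[3])  # smallest P value first
    kept, dropped = [], set()
    for snp_id, chrom, pos, _ in ordered:
        if snp_id in dropped:
            continue
        kept.append(snp_id)
        # Drop every later SNP that is close to, and in LD with, this index SNP.
        for other_id, other_chrom, other_pos, _ in ordered:
            if other_id == snp_id or other_id in dropped or other_id in kept:
                continue
            if other_chrom != chrom or abs(other_pos - pos) > window_bp:
                continue
            if r2.get(frozenset((snp_id, other_id)), 0.0) >= r2_max:
                dropped.add(other_id)
    return kept


# Hypothetical SNPs and LD values, for illustration only.
example_snps = [
    ("snpA", "1", 1_000, 1e-20),
    ("snpB", "1", 2_000, 1e-15),
    ("snpC", "1", 900_000, 1e-14),
    ("snpD", "2", 1_500, 1e-13),
]
example_r2 = {
    frozenset(("snpA", "snpB")): 0.9,
    frozenset(("snpA", "snpC")): 0.9,
}
print(clump(example_snps, example_r2))
```

In this toy input, snpB is removed because it is close to and in LD with snpA, whereas snpC is retained because it lies outside the 250 kb window. The wider MHC window described in the text would correspond to calling the function with `window_bp=10_000_000` for SNPs in that region.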

Table 1: Fetal brain mQTLs are significantly enriched for schizophrenia genetic risk variants

The identified mQTLs residing in schizophrenia-associated GWAS regions do not necessarily represent the actual causal risk variants; in many instances, we are likely to be capturing 'passenger' effects whereby the variant influencing DNA methylation and the schizophrenia-associated SNP are instead co-segregating in the same LD block. Thus, we sought to identify instances in which a likely causal risk variant for schizophrenia was an mQTL SNP. We used data from the 1000 Genomes Project (http://www.1000genomes.org/) to identify all variants in strong LD (r2 > 0.8) with the 125 autosomal index SNPs associated with schizophrenia12. Of note, two of the actual schizophrenia GWAS index SNPs represented Bonferroni-significant fetal brain mQTLs: rs2535627 (associated with DNA methylation at cg11645453, P = 3.05 × 10−13; Supplementary Fig. 10) and rs4648845 (associated with DNA methylation at cg02275930, P = 4.54 × 10−15; Supplementary Fig. 11). 46 additional mQTL variants that were in strong LD with another six index SNPs were part of 86 highly significant fetal brain mQTL pairs (Supplementary Table 16).

mQTLs can be used to localize putative causal loci within large genomic regions associated with schizophrenia

To generate a more comprehensive database of mQTLs in the fetal brain and to identify more examples in which the same SNP is associated with both DNA methylation and disease, we imputed our genotype data using the most recent panel downloaded from the 1000 Genomes Project (Online Methods). Using an imputed set of 5,177,320 variants, we identified an additional 256,040 mQTLs, which reflected the non-imputed data set in terms of genomic distribution and observed effect sizes (Supplementary Table 2 and Supplementary Fig. 12). The full list of fetal brain mQTLs after imputation can be downloaded from http://epigenetics.essex.ac.uk/mQTL//All_Imputed_BonfSignificant_mQTLs.csv.gz. Our imputed data enabled us to identify 1,067 instances in which the same SNP was associated with both DNA methylation and schizophrenia, with a comprehensive list available for download at http://epigenetics.essex.ac.uk/mQTL//PGC_IndexSNPs_QTLs_Inc2PCs_AllTissues_MatchSNPPosition.csv. Because they could be biased by the LD structure at associated loci, the imputed mQTL data were not used for subsequent enrichment analyses, but they enabled us to further refine schizophrenia candidate regions and undertake colocalization analyses to identify variants associated with both DNA methylation and schizophrenia. We performed a Bayesian colocalization analysis34 across the 105 autosomal regions associated with schizophrenia12, spanning 19,378 DNA methylation sites included in our analysis. Instead of focusing only on the intersection of significant variants associated with two phenotypes independently, this approach compares the pattern of association results from the schizophrenia GWAS and mQTL analyses across a region, combining the summary statistics into posterior probabilities for five hypotheses (Online Methods). 
As this methodology assumes that the causal variant is present, or at least very well tagged, in the data set, these colocalization analyses were performed using our imputed fetal brain mQTL data set. The posterior probabilities for 65 regions, involving 296 DNA methylation sites in 306 pairs, were supportive of a colocalized association signal for both schizophrenia and DNA methylation in that region (PP3+PP4 > 0.99; Supplementary Table 17). 26 of these pairs (covering 15 regions associated with schizophrenia) had a higher posterior probability for both schizophrenia and DNA methylation being associated with the same causal variant (PP4/PP3 > 1), with 16 (10 regions) of these having sufficient support for them to be considered as 'convincing' (PP4/PP3 > 5) according to the criteria of a previous study34. Of note, three of these top-ranked pairs were annotated to the AS3MT locus in a robust schizophrenia-associated region on chromosome 10 (Supplementary Table 17). The utility of mQTL mapping for localizing putative causal loci associated with disease in this region is shown in Figure 3, with additional examples for other schizophrenia-associated regions available on our website (http://epigenetics.essex.ac.uk/mQTL/).
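The posterior-probability criteria used above can be summarized as a simple decision rule. In the colocalization framework, PP3 is the posterior probability that the two traits are driven by distinct causal variants in the region and PP4 is the posterior probability that they share a single causal variant. The sketch below applies the thresholds quoted in the text; the function name and the returned labels are illustrative, not terminology from the original analysis.

```python
def classify_coloc(pp3, pp4):
    """Apply the thresholds quoted in the text to one region/probe pair.

    pp3: posterior probability of two distinct causal variants.
    pp4: posterior probability of one shared causal variant.
    """
    if pp3 + pp4 <= 0.99:
        return "unsupported"          # no strong evidence both traits are associated
    if pp3 == 0:
        return "convincing_shared"    # ratio is unbounded in favor of sharing
    ratio = pp4 / pp3
    if ratio > 5:
        return "convincing_shared"    # PP4/PP3 > 5
    if ratio > 1:
        return "shared_favored"       # PP4/PP3 > 1
    return "distinct_favored"


# Hypothetical posterior probabilities, for illustration only.
for pp3, pp4 in [(0.1, 0.895), (0.4, 0.595), (0.7, 0.295), (0.5, 0.3)]:
    print(pp3, pp4, classify_coloc(pp3, pp4))
```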

Figure 3: mQTL mapping can localize putative causal loci associated with disease.

Colocalization analyses yielded strong support for variants annotated to the AS3MT gene being associated with both schizophrenia and DNA methylation in a broad genomic region (chr10:104535135-105006335) identified in a recent GWAS analysis of schizophrenia12. All potential causal variants (r2 > 0.8 with index variant) present in the imputed mQTL data set are indicated by vertical blue lines and all DNA methylation probes in each region by vertical red lines. Bonferroni-significant mQTLs are indicated by black lines between the respective variant and DNA methylation probe, where the line width reflects the magnitude of the effect. Additional examples of fetal brain mQTLs in genomic regions showing genome-wide significant association with schizophrenia are available on our website (http://epigenetics.essex.ac.uk/mQTL/).

Schizophrenia-associated genomic regions are enriched for fetal-specific mQTLs

Given the hypothesized neurodevelopmental component to schizophrenia, we examined the extent to which the mQTLs overlapping with schizophrenia-associated variants are characterized by fetal-specific effects (Supplementary Table 16). Across the 78 mQTLs also tested in adult brain samples, overall effect sizes were significantly larger in fetal brain than in all of the adult brain regions tested (Wilcoxon rank-sum test: PFC P = 0.0420, STR P = 0.00226, CER P = 0.00998; Fig. 4). Our heterogeneity analysis highlighted 16 (20.5%) instances in which significantly different relationships between genotype and DNA methylation were found across the adult and fetal data sets, with eight classed as fetal-specific variants (that is, those not reaching our replication threshold (P < 1.00 × 10−5) in any adult brain region) and the remaining eight demonstrating smaller effects across the adult brain.

Figure 4: Fetal mQTLs in schizophrenia-associated regions have larger effects on DNA methylation during neurodevelopment than in the adult brain.

For robustly associated schizophrenia GWAS variants characterized by human fetal brain mQTLs (Supplementary Table 16), we compared effect sizes between fetal and adult brain. Effect sizes (red to blue) for the corresponding mQTLs in adult brain were significantly lower across all three adult brain regions tested (Wilcoxon rank-sum test: PFC P = 0.0420, STR P = 0.00226, CER P = 0.00998). The heterogeneity P value for each mQTL is depicted by the green (high heterogeneity) to yellow (low heterogeneity) column. White indicates that the mQTL was not tested in the adult brain samples.

Discussion

To explore the functional consequences of genetic variation in the developing human brain, we characterized mQTLs in human fetal brain samples (spanning 56–166 d post-conception), identifying over 16,000 associated pairs of SNPs and DNA methylation sites. We found that fetal brain mQTLs were significantly enriched in functional regulatory domains, including DHSs, regions of repressive histone modifications and specific transcription factor binding sites across the genome, and observed significant overlap with genetic variants influencing gene expression in the brain. Although the majority of fetal brain mQTLs appear to be conserved across adult brain regions, we found evidence for fetal-specific genetic effects at certain loci. Our data concur with findings from an independent study of cortical mQTLs across development35; mQTL effects were highly consistent across both analyses (Supplementary Fig. 13) and largely conserved between fetal and adult brain.

There is growing evidence that the majority of common variants associated with complex traits act through effects on gene regulation10,11. Our data add to a growing literature showing that DNA methylation is genetically influenced26, with mQTLs representing a potential mechanism linking genetic variation to complex phenotypes4,9,36,37. We found a significant enrichment of schizophrenia-associated GWAS variants in fetal brain mQTLs, indicating that common genetic variants conferring risk for schizophrenia are associated with altered DNA methylation in fetal human brain. The hypothesis that schizophrenia has an early neurodevelopmental component is supported by several lines of epidemiological and neuropathological evidence13,14. However, direct molecular evidence of schizophrenia risk factors operating in the fetal brain is scarce38,39. We recently found that genomic loci that are differentially methylated between schizophrenia patients and unaffected controls in the adult brain are enriched at those undergoing dynamic changes in DNA methylation during human fetal brain development6,40. Here we found that genetic variants exhibiting genome-wide significant association with schizophrenia12 showed a fourfold enrichment amongst fetal brain mQTLs, directly implicating altered gene regulation during fetal brain development in the etiology of the disorder.

To conclude, we report, to the best of our knowledge, the first systematic analysis of genetically mediated DNA methylation in the developing human brain. Our data support the hypothesis that a substantial proportion of the genetic variants conferring schizophrenia risk have regulatory effects that become manifest early in the prenatal period and demonstrate the utility of mQTL mapping for localizing putative causal loci associated with complex disease phenotypes in large genomic regions. As a resource to the wider community, we have developed a searchable online database of fetal brain mQTLs that can be accessed at http://epigenetics.essex.ac.uk/mQTL/.

Methods

Human brain samples.

Human fetal brain tissue was acquired from the Human Developmental Biology Resource (HDBR) (http://www.hdbr.org) and MRC Brain Banks network (http://www.mrc.ac.uk/research/facilities/brain-banks/access-for-research). Ethical approval for the HDBR was granted by the Royal Free Hospital research ethics committee under reference 08/H0712/34 and Human Tissue Authority (HTA) material storage license 12220; ethical approval for the MRC Brain Bank was granted under reference 08/MRE09/38. A detailed description of these samples can be found elsewhere6. Briefly, 173 fetal brain samples (94 male, 79 female) ranging from 56–169 d post-conception were used for DNA methylation and SNP profiling. Brain tissue was obtained frozen and had not been dissected into regions. Half of the brain tissue from each individual fetus was homogenized for subsequent genomic DNA extraction. Postnatal prefrontal cortex (PFC), striatum (STR) and cerebellum (CER) samples were obtained from the MRC London Neurodegenerative Disease Brain Bank and the Douglas Bell-Canada Brain Bank (DBCBB) (http://www.douglasbrainbank.ca) and included both schizophrenia cases and controls. Brain specimens were collected postmortem following consent obtained from next of kin, dissected by neuropathology technicians, snap-frozen and stored at –80 °C. Genomic DNA was isolated from all brain samples using a standard phenol-chloroform extraction protocol. DNA was tested for degradation and purity using spectrophotometry and gel electrophoresis.

Genome-wide quantification of DNA methylation.

500 ng of DNA from each sample was treated with sodium bisulfite in duplicate, using the EZ-96 DNA Methylation kit (Zymo Research). DNA methylation was quantified using the Illumina Infinium HumanMethylation450 BeadChip (Illumina) run on an Illumina iScan System (Illumina) using the manufacturer's standard protocol. Signal intensities for each probe were extracted using Illumina GenomeStudio software (Illumina) and imported into the R statistical program using the methylumi and minfi packages41,42. Multi-dimensional scaling (MDS) plots of variable probes on the sex chromosomes were used to check that the predicted sex corresponded with the reported sex for each individual. Further data quality control and processing steps were conducted using the wateRmelon package43 in R. The pfilter function was used to filter the data in two steps: first, samples in which >1% of probes had a detection P value > 0.05 were removed; second, probes with a detection P value > 0.05 in at least 1% of samples and/or a bead count <3 in 5% of samples were removed across all samples to control for poor-quality probes. The dasen function was used to normalize the data as previously described44. Cross-hybridizing probes44,45, probes with a SNP within 10 bp of the CpG site or single-base extension44 and probes on the sex chromosomes were excluded from the QTL analysis. These data are publicly available through GEO under accession numbers GSE58885, GSE61431 and GSE61380. Genotype data are available from dbSNP. All fetal brain mQTL data are also available via an online database at http://epigenetics.essex.ac.uk/mQTL/.

Genome-wide SNP genotyping.

200 ng of genomic DNA from each sample was genotyped using the Illumina HumanOmniExpress BeadChip (Illumina). Following scanning, Illumina GenomeStudio software was used for genotype calling and the data were exported as ped and map files. PLINK30 was used to remove samples with >5% missing values, and SNPs with >1% missing values, Hardy-Weinberg equilibrium P < 0.001 or a minor allele frequency of <5%. Subsequently, SNPs were also filtered so that each of the three genotype groups with 0, 1 or 2 minor alleles (or two genotype groups in the case of rare SNPs with 0 or 1 minor allele) had a minimum of 5 observations.
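As a rough illustration of the per-SNP criteria, the sketch below applies the missingness, minor allele frequency and genotype-group count filters to a single SNP. It is not the PLINK implementation used in the study: the function name `passes_snp_filters` is hypothetical, genotypes are assumed to be coded as minor allele counts with `None` for missing calls, and the Hardy-Weinberg test is deliberately omitted.

```python
from collections import Counter

def passes_snp_filters(genotypes, max_missing=0.01, min_maf=0.05, min_group=5):
    """genotypes: list of minor-allele counts (0, 1, 2), or None if missing.
    Returns True if the SNP passes the missingness, MAF and group-size filters."""
    n = len(genotypes)
    called = [g for g in genotypes if g is not None]
    if not called or (n - len(called)) / n > max_missing:
        return False
    freq = sum(called) / (2 * len(called))
    maf = min(freq, 1 - freq)
    if maf < min_maf:
        return False
    counts = Counter(called)
    # SNPs with no minor-allele homozygotes are allowed two groups (0 and 1);
    # otherwise all three groups must reach the minimum size.
    required = [0, 1, 2] if counts[2] > 0 else [0, 1]
    return all(counts[g] >= min_group for g in required)
```

For example, a SNP observed as 80 major homozygotes, 17 heterozygotes and 3 minor homozygotes passes the frequency filter but is rejected because the smallest genotype group has fewer than 5 observations.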

Methylation QTL (mQTL) analyses.

Before commencing QTL analyses, genotypes at the polymorphic SNP probes on the HumanMethylation 450K array were compared to calls from the HumanOmniExpress genotyping array to confirm sample identity. All genome-wide SNP-methylation probe pairs were tested using the R package MatrixEQTL46. This package enables fast computation of QTLs by only saving those more significant than a pre-defined threshold (set to P = 0.0001 for these analyses). An additive linear model was fitted to test if the number of alleles (coded 0, 1 and 2) predicted DNA methylation (beta value 0–100) at each site, including covariates for age, sex and the first two principal components from the genotype data to control for ethnicity differences. A brain bank covariate was additionally included for the adult data sets.
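The additive model for a single SNP-probe pair can be sketched as an ordinary least-squares fit of methylation on allele count plus covariates. The study used MatrixEQTL in R, which performs the equivalent computation for all pairs using large matrix operations; the pure-Python functions `solve` and `additive_mqtl` below are hypothetical helpers written for illustration, and they return the per-allele effect and its t statistic rather than a P value.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x

def additive_mqtl(methylation, genotype, covariates):
    """Regress methylation (%) on allele count (0/1/2) plus covariates.
    covariates: one tuple per sample, e.g. (age, sex, PC1, PC2).
    Returns (per-allele effect, t statistic)."""
    n = len(methylation)
    X = [[1.0, float(g)] + [float(c) for c in cov]
         for g, cov in zip(genotype, covariates)]
    p = len(X[0])
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
           for a in range(p)]
    xty = [sum(X[i][a] * methylation[i] for i in range(n)) for a in range(p)]
    beta = solve(xtx, xty)
    resid = [methylation[i] - sum(X[i][a] * beta[a] for a in range(p))
             for i in range(n)]
    sigma2 = sum(r * r for r in resid) / (n - p)
    # Variance of the genotype coefficient: sigma^2 * [(X'X)^-1] at index (1, 1).
    unit = [1.0 if a == 1 else 0.0 for a in range(p)]
    var_beta = sigma2 * solve(xtx, unit)[1]
    return beta[1], beta[1] / var_beta ** 0.5
```

The returned effect is in units of percentage methylation per allele, matching the 0–100 scale used for beta values in the text.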

Identifying overlap and testing for enrichment of expression QTLs (eQTLs) among fetal brain mQTLs.

P values for all cis eQTLs (within 1 Mb) were supplied by the authors of a recent manuscript documenting eQTLs in the human brain29 to enable a more thorough examination of the overlap of eQTLs and mQTLs. To identify all variants associated with DNA methylation and gene expression in cis, our definition of cis mQTL was relaxed to match that used in the eQTL study. Chromosome and base position of the SNPs were used to map between the two data sets. A Bonferroni significance threshold was established for the eQTL results (P from aveALL analysis < 5.99 × 10−9), based on the number of cis eQTLs tested across all SNPs overlapping with those tested in the mQTL data set (414,172), in addition to two more relaxed exploratory thresholds (P < 10−9; 10−7).

Prior to testing for a significant overlap with SNPs associated with brain eQTLs, all SNPs associated with at least one DNA methylation site in the fetal brain were 'clumped' based on their best mQTL P value using PLINK30 to create a list of quasi-independent SNPs (r2 < 0.25 for all pairs of SNPs within 250 kb) and to prevent LD between SNPs in the set from biasing the results. Given the extensive correlation between variants in the major histocompatibility complex (MHC) region, a more stringent clumping procedure was used for SNPs located in chr6:25000000–35000000, where the window for pairwise SNP comparisons was extended to 10,000 kb. To test for a larger overlap than expected by chance, up to 1,000,000 simulated sets, matched for allele frequency, were drawn to calculate the expected overlap and generate empirical P values. SNPs were categorised into MAF bins split at intervals of 2%, and SNPs from each bin were selected to match the distribution in the test set. Empirical significance for enrichment of eQTLs in mQTLs was ascertained by counting the number of simulations in which at least as many SNPs overlapped the set of sentinelized subsignals from the aveALL analysis described previously29 as in the true 'clumped' Bonferroni-significant mQTL SNP set, and dividing by the number of simulations performed. Fold change statistics were calculated as the true overlap divided by the mean overlap of these simulations, and 95% confidence intervals as the true overlap divided by the 2.5th and 97.5th percentiles of the distribution of overlaps.
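The simulation procedure can be sketched as follows. This is an illustration under stated assumptions rather than the authors' R code: the function names `maf_bin`, `ratio` and `simulate_enrichment` are hypothetical, the clumped SNP set and per-SNP allele frequencies are taken as given, matched SNPs are drawn without replacement within each 2% MAF bin from all tested SNPs, and the number of simulations is far smaller than the up to 1,000,000 used in the study.

```python
import random

def maf_bin(maf, width=0.02):
    """Index of the 2%-wide MAF bin containing `maf`."""
    return int(maf / width)

def ratio(a, b):
    """a / b, returning infinity when the denominator is zero."""
    return a / b if b else float("inf")

def simulate_enrichment(test_snps, maf_by_snp, target, n_sim=1000, seed=1):
    """test_snps: clumped mQTL SNP ids; maf_by_snp: id -> MAF for every SNP
    tested; target: set of SNP ids in the comparison set (e.g. eQTL SNPs)."""
    rng = random.Random(seed)
    bins = {}
    for snp, maf in maf_by_snp.items():
        bins.setdefault(maf_bin(maf), []).append(snp)
    # Number of SNPs to draw from each MAF bin to match the test set.
    need = {}
    for snp in test_snps:
        b = maf_bin(maf_by_snp[snp])
        need[b] = need.get(b, 0) + 1
    observed = sum(snp in target for snp in test_snps)
    sims = []
    for _ in range(n_sim):
        drawn = [s for b, k in need.items() for s in rng.sample(bins[b], k)]
        sims.append(sum(s in target for s in drawn))
    # Empirical P: fraction of simulations with overlap >= the observed overlap.
    p_emp = sum(x >= observed for x in sims) / n_sim
    fold = ratio(observed, sum(sims) / n_sim)
    sims.sort()
    q_lo = sims[int(0.025 * (n_sim - 1))]
    q_hi = sims[int(0.975 * (n_sim - 1))]
    return {"observed": observed, "p": p_emp, "fold": fold,
            "ci": (ratio(observed, q_hi), ratio(observed, q_lo))}
```

Note that with a finite number of simulations an empirical P value of 0 only indicates that P is below roughly 1 divided by the number of simulations, which is why a large number of draws is needed to resolve strong enrichments.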

Enrichment of regulatory regions.

Published 450K array probe annotations23 were used to identify probes located in transcription factor binding sites (TFBSs) or DNase I hypersensitivity sites (DHSs) based on data made publicly available as part of the ENCODE project10,22. In addition, brain-specific DHSs were downloaded from the UCSC (University of California, Santa Cruz) Genome Browser for 'Frontal_cortex_OC', 'Cerebellum_OC' and 'Cerebrum_frontal_OC' and used to annotate DNA methylation sites in the same manner. Peaks associated with 5 histone modifications identified separately in two fetal brain samples (17 weeks gestation; 1 male, 1 female; sample IDs E081 and E082) were downloaded from the Epigenomics Roadmap project24. Due to the heterogeneity in the ChIP-seq profiles, presumed to reflect experimental rather than biological differences47, DNA methylation sites had to be located within peaks generated from both brain samples to be classed as overlapping any of the histone marks. The overlap between regulatory features and the DNA methylation sites identified from the set of Bonferroni-significant mQTLs in the fetal brain data set was tested for enrichment using a two-sided Fisher's exact test on 2 × 2 tables. The significance level for enrichment of overlap with transcription factor binding sites was calculated using a Bonferroni correction for the 149 different transcription factor binding sites tested.
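For readers unfamiliar with the test, a two-sided Fisher's exact test on a 2 × 2 table (mQTL-associated versus other probes, inside versus outside a regulatory feature) can be sketched with the hypergeometric distribution. The function `fisher_exact_two_sided` and the constant `BONFERRONI_TFBS` are hypothetical names; the two-sided P value is taken here as the sum of the probabilities of all tables no more likely than the observed one, which is the convention used by common statistical software but is an assumption about the exact variant applied in the study. A nominal alpha of 0.05 is also assumed.

```python
from math import comb

# Bonferroni-corrected significance level for 149 TFBS tests (alpha = 0.05 assumed).
BONFERRONI_TFBS = 0.05 / 149

def fisher_exact_two_sided(a, b, c, d):
    """2 x 2 table [[a, b], [c, d]]; returns (odds ratio, two-sided P value)."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell given the margins.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum probabilities of all tables at most as likely as the observed table.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    odds = (a * d) / (b * c) if b * c else float("inf")
    return odds, min(1.0, p)
```

Exact integer binomial coefficients keep the sketch simple, but for array-scale counts a log-space implementation from a statistics library would be much faster.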

Heterogeneity model.

All Bonferroni-significant mQTLs (P < 3.69 × 10−13) identified in the fetal brain, for which corresponding mQTL data were available from all three adult brain regions, were tested for heterogeneous relationships between DNA methylation and genetic variation across the data sets (n = 10,663). A null model of no heterogeneity was fitted in line with the linear model used to test for mQTL effects, relating the number of alleles (coded 0, 1 and 2) to DNA methylation (beta value 0–100) with fixed-effect covariates for sex, age and the first two genetic principal components. As the adult brain regions were dissected from the same set of individuals, we expect their DNA methylation values to be correlated. In addition, we expect DNA methylation values within a brain region to be correlated; therefore, both individual and brain region were included as random effects, in addition to an indicator variable discriminating fetal from adult samples to control for absolute differences in DNA methylation level associated with age/developmental stage. This was compared with a heterogeneity model that included an interaction between genotype and the developmental stage indicator, using an ANOVA to calculate the heterogeneity P value.
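One way to write the two models being compared is given below. The notation is ours and is offered as a reading of the description above, not as the exact parameterization in the authors' scripts: i indexes individuals, r indexes tissue (fetal brain or one of the three adult regions), G is the allele count, D is the fetal-versus-adult indicator, and u and v are the random effects for individual and brain region.

```latex
% Null model (no heterogeneity)
y_{ir} = \beta_0 + \beta_G G_i + \beta_D D_i
       + \beta_{\mathrm{sex}}\,\mathrm{sex}_i + \beta_{\mathrm{age}}\,\mathrm{age}_i
       + \beta_1\,\mathrm{PC1}_i + \beta_2\,\mathrm{PC2}_i
       + u_i + v_r + \varepsilon_{ir}

% Heterogeneity model: the null model plus a genotype-by-stage interaction
y_{ir} = \beta_0 + \beta_G G_i + \beta_D D_i + \beta_{GD}\, G_i D_i
       + \beta_{\mathrm{sex}}\,\mathrm{sex}_i + \beta_{\mathrm{age}}\,\mathrm{age}_i
       + \beta_1\,\mathrm{PC1}_i + \beta_2\,\mathrm{PC2}_i
       + u_i + v_r + \varepsilon_{ir}

% Random effects and residual
u_i \sim N(0, \sigma_u^2), \qquad v_r \sim N(0, \sigma_v^2), \qquad
\varepsilon_{ir} \sim N(0, \sigma^2)
```

Under this reading, the heterogeneity P value tests whether the interaction coefficient \beta_{GD} is zero, that is, whether the per-allele effect on DNA methylation differs between fetal and adult brain.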

Enrichment of disease-associated variants among fetal brain mQTLs.

A similar simulation procedure to that used to test the overlap of mQTLs and eQTLs was used to test for a larger overlap than expected by chance between fetal brain mQTL SNPs and those identified in GWAS of complex disorders and traits, including schizophrenia12, Alzheimer's disease31, body mass index (BMI)32 and type 2 diabetes33. The clumping procedure as described for the eQTL enrichment analysis was repeated separately for each phenotype to ensure that the best mQTL SNP present in those analyzed in the GWAS was retained. Up to 1 million simulations were performed to generate the expected overlap between the set of mQTL SNPs and variants associated with each disorder at four GWAS significance thresholds (P < 5 × 10−5, 5 × 10−6, 5 × 10−7, 5 × 10−8) and derive fold change statistics and empirical P values.

Imputation.

Prior to imputation, PLINK30 was used to remove samples with >5% missing data. We also excluded SNPs characterized by >1% missing values, a Hardy-Weinberg equilibrium P < 0.001 or a minor allele frequency of <5%. The filtered genotypes were recoded as vcf files using PLINK1.9 (ref. 48) and VCFtools49 before uploading to the Michigan Imputation Server (https://imputationserver.sph.umich.edu/start.html#!pages/home), which uses SHAPEIT50,51 to phase haplotypes and Minimac3 (ref. 52) with the most recent 1000 Genomes reference panel (phase 3, version 5). Imputed genotypes were then filtered and recoded with PLINK1.9 (ref. 48), removing samples with >5% missing values, SNPs with >2 alleles and those indicated as a fail in the FILTER column using the flag '--vcf-filter', in addition to those characterized by >1% missing values, a Hardy-Weinberg equilibrium P < 0.001, a minor allele frequency of <5%, or <5 observations for any genotype group, in line with the SNP filtering for the raw genotype groups. This resulted in 5,177,320 variants in the imputed set of genotypes. MatrixEQTL46 was used to test genome-wide mQTLs as previously described, except that only mQTLs with P < 10−8 were recorded.

Colocalization analyses.

Schizophrenia-associated genomic loci were taken as the 105 autosomal regions published as part of the PGC mega-analysis12. Given our definition of cis mQTLs (that is, associations between SNPs and DNA methylation probes within 500 kb), all DNA methylation sites located within 500 kb of these regions were identified and cis mQTL analysis was repeated with the imputed genotypes using MatrixEQTL46, recording all mQTL results. Colocalization analysis was performed as previously described34 using the R coloc package (http://cran.r-project.org/web/packages/coloc) for each DNA methylation site within each region. In total, 19,607 possible mQTLs were tested. From both the PGC schizophrenia GWAS data and our mQTL results we supplied the regression coefficients, their variances and SNP minor allele frequencies; the prior probabilities were left at their default values. This methodology quantifies the support across the results of each GWAS for five hypotheses by calculating the posterior probabilities, denoted as PPi for hypothesis Hi:

H0: there exist no causal variants for either trait;

H1: there exists a causal variant for one trait only, schizophrenia;

H2: there exists a causal variant for one trait only, DNA methylation;

H3: there exist two distinct causal variants, one for each trait;

H4: there exists a single causal variant common to both traits.
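The posterior probabilities for these five hypotheses can be sketched from per-SNP approximate Bayes factors, following the published colocalization method34. This is a simplified Python illustration, not the coloc package itself: the function names are hypothetical, the prior probabilities (1e-4, 1e-4 and 1e-5) and prior effect standard deviations (0.2 for a case-control trait, 0.15 for a quantitative trait) are what we understand to be the package defaults, and the quantitative trait is assumed to be standardized to unit variance.

```python
from math import exp, log

def logsum(xs):
    """log(sum(exp(x))) computed stably."""
    m = max(xs)
    return m + log(sum(exp(x - m) for x in xs))

def logdiff(a, b):
    """log(exp(a) - exp(b)); returns -infinity when the difference vanishes."""
    if b >= a:
        return float("-inf")
    return a + log(1.0 - exp(b - a))

def log_abf(beta, var_beta, prior_sd):
    """Wakefield approximate Bayes factor (log scale) for one SNP."""
    z2 = beta * beta / var_beta
    r = prior_sd ** 2 / (prior_sd ** 2 + var_beta)
    return 0.5 * (log(1.0 - r) + r * z2)

def coloc_posteriors(l1, l2, p1=1e-4, p2=1e-4, p12=1e-5):
    """l1, l2: per-SNP log ABFs for trait 1 (schizophrenia) and trait 2
    (DNA methylation) over the same SNPs. Returns [PP0, PP1, PP2, PP3, PP4]."""
    s1, s2 = logsum(l1), logsum(l2)
    s12 = logsum([a + b for a, b in zip(l1, l2)])
    lh = [0.0,                                            # H0: no causal variant
          log(p1) + s1,                                   # H1: trait 1 only
          log(p2) + s2,                                   # H2: trait 2 only
          log(p1) + log(p2) + logdiff(s1 + s2, s12),      # H3: two distinct variants
          log(p12) + s12]                                 # H4: one shared variant
    total = logsum(lh)
    return [exp(x - total) for x in lh]
```

In this formulation a high PP4 requires that the same SNPs carry the evidence for both traits, whereas strong but non-overlapping signals shift the support to PP3.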

Code availability.

Annotated analysis scripts for the analyses used in this study are available for download at http://epigenetics.essex.ac.uk/mQTL/ and in the Supplementary Material.

A Supplementary Methods Checklist is available.

References

  1. et al. Spatio-temporal transcriptome of the human brain. Nature 478, 483–489 (2011).
  2. Functions of DNA methylation: islands, start sites, gene bodies and beyond. Nat. Rev. Genet. 13, 484–492 (2012).
  3. et al. Charting a dynamic DNA methylation landscape of the human genome. Nature 500, 477–481 (2013).
  4. et al. The relationship between DNA methylation, genetic and expression inter-individual variation in untransformed human fibroblasts. Genome Biol. 15, R37 (2014).
  5. et al. Conserved role of intragenic DNA methylation in regulating alternative promoters. Nature 466, 253–257 (2010).
  6. et al. Methylomic trajectories across human fetal brain development. Genome Res. 25, 338–352 (2015).
  7. et al. Abundant quantitative trait loci exist for DNA methylation and gene expression in human brain. PLoS Genet. 6, e1000952 (2010).
  8. et al. Enrichment of cis-regulatory gene expression SNPs and methylation quantitative trait loci among bipolar disorder susceptibility variants. Mol. Psychiatry 18, 340–346 (2013).
  9. et al. Passive and active DNA methylation and the interplay with genetic variation in gene regulation. Elife 2, e00523 (2013).
  10. et al. Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195 (2012).
  11. et al. Trait-associated SNPs are more likely to be eQTLs: annotation to enhance discovery from GWAS. PLoS Genet. 6, e1000888 (2010).
  12. Schizophrenia Working Group of the Psychiatric Genomics Consortium. Biological insights from 108 schizophrenia-associated genetic loci. Nature 511, 421–427 (2014).
  13. The neurodevelopmental hypothesis of schizophrenia, revisited. Schizophr. Bull. 35, 528–548 (2009).
  14. From neuropathology to neurodevelopment. Lancet 346, 552–557 (1995).
  15. et al. The effect of genotype and in utero environment on inter-individual variation in neonate DNA methylomes. Genome Res. 24, 1064–1074 (2014).
  16. et al. Genome-wide associations between genetic and epigenetic variation influence mRNA expression and insulin secretion in human pancreatic islets. PLoS Genet. 10, e1004735 (2014).
  17. et al. The presence of methylation quantitative trait loci indicates a direct genetic influence on the level of DNA methylation in adipose tissue. PLoS ONE 8, e55923 (2013).
  18. et al. Tissue-specific effects of genetic and epigenetic variation on gene regulation and splicing. PLoS Genet. 11, e1004958 (2015).
  19. et al. Long-range epigenetic regulation is conferred by genetic variation located at thousands of independent loci. Nat. Commun. 6, 6326 (2015).
  20. Exploring the three-dimensional organization of genomes: interpreting chromatin interaction data. Nat. Rev. Genet. 14, 390–403 (2013).
  21. Interchromosomal associations between alternatively expressed loci. Nature 435, 637–645 (2005).
  22. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
  23. et al. Identification and systematic annotation of tissue-specific differentially methylated regions using the Illumina 450k array. Epigenetics Chromatin 6, 26 (2013).
  24. Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
  25. et al. Intermediate DNA methylation is a conserved signature of genome regulation. Nat. Commun. 6, 6363 (2015).
  26. et al. Contribution of genetic variation to transgenerational inheritance of DNA methylation. Genome Biol. 15, R73 (2014).
  27. et al. Widespread plasticity in CTCF occupancy linked to DNA methylation. Genome Res. 22, 1680–1688 (2012).
  28. CTCF: an architectural protein bridging genome topology and function. Nat. Rev. Genet. 15, 234–246 (2014).
  29. et al. Genetic variability in the regulation of gene expression in ten regions of the human brain. Nat. Neurosci. 17, 1418–1428 (2014).
  30. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
  31. et al. Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer's disease. Nat. Genet. 45, 1452–1458 (2013).
  32. et al. Genetic studies of body mass index yield new insights for obesity biology. Nature 518, 197–206 (2015).
  33. et al. Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nat. Genet. 44, 981–990 (2012).
  34. et al. Bayesian test for co-localization between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014).
  35. et al. Mapping DNA methylation across development, genotype, and schizophrenia in the human frontal cortex. Nat. Neurosci. advance online publication (2015).
  36. Allele-specific methylation in the human genome: implications for genetic studies of complex disease. Epigenetics 5, 578–582 (2010).
  37. et al. Identification of schizophrenia-associated loci by combining DNA methylation and gene expression data from whole blood. Eur. J. Hum. Genet. 23, 1106–1110 (2015).
  38. Evidence that schizophrenia risk variation in the ZNF804A gene exerts its effects during fetal brain development. Am. J. Psychiatry 169, 1301–1308 (2012).
  39. et al. Expression of ZNF804A in human brain and alterations in schizophrenia, bipolar disorder, and major depressive disorder: a novel transcript fetally regulated by the psychosis risk variant rs1344706. JAMA Psychiatry 71, 1112–1120 (2014).
  40. et al. Methylomic profiling of human brain tissue supports a neurodevelopmental origin for schizophrenia. Genome Biol. 15, 483 (2014).
  41. methylumi: Handle Illumina methylation data. R package version 2.14.0 (2015).
  42. et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics 30, 1363–1369 (2014).
  43. et al. A data-driven approach to preprocessing Illumina 450K methylation array data. BMC Genomics 14, 293 (2013).
  44. et al. Discovery of cross-reactive probes and polymorphic CpGs in the Illumina Infinium HumanMethylation450 microarray. Epigenetics 8, 203–209 (2013).
  45. et al. Additional annotation enhances potential for biologically relevant analysis of the Illumina Infinium HumanMethylation450 BeadChip array. Epigenetics Chromatin 6, 4 (2013).
  46. Matrix eQTL: ultra fast eQTL analysis via large matrix operations. Bioinformatics 28, 1353–1358 (2012).
  47. Large-scale imputation of epigenomic datasets for systematic annotation of diverse human tissues. Nat. Biotechnol. 33, 364–376 (2015).
  48. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
  49. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
  50. A linear complexity phasing method for thousands of genomes. Nat. Methods 9, 179–181 (2012).
  51. Improved whole-chromosome phasing for disease and population genetic studies. Nat. Methods 10, 5–6 (2013).
  52. Fast and accurate genotype imputation in genome-wide association studies through pre-phasing. Nat. Genet. 44, 955–959 (2012).


Acknowledgements

We thank M. Weale for providing eQTL data from the BRAINEAC database. This work was supported by grants from the UK Medical Research Council (MRC; MR/K013807/1 to J.M. and MR/L010674/1 to N.J.B.) and the US National Institutes of Health (AG036039) to J.M. R.P. and H.S. were funded by MRC PhD studentships. The human embryonic and fetal material was provided by the Joint MRC/Wellcome Trust (grant #099175/Z/12/Z) Human Developmental Biology Resource.

Author information

Author notes

    Nicholas J Bray & Jonathan Mill contributed equally to this work.

Affiliations

  1. University of Exeter Medical School, University of Exeter, Exeter, UK.
     Eilis Hannon, Joana Viana, Joe Burrage, Therese M Murphy & Jonathan Mill
  2. Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK.
     Helen Spiers, Claire Troakes, Nicholas J Bray & Jonathan Mill
  3. Garvan Institute of Medical Research, Sydney, NSW, Australia.
     Ruth Pidsley
  4. Douglas Mental Health Institute, McGill University, Montreal, QC, Canada.
     Gustavo Turecki
  5. MRC Centre for Neuropsychiatric Genetics and Genomics, Cardiff University School of Medicine, Cardiff, UK.
     Michael C O'Donovan & Nicholas J Bray
  6. School of Biological Sciences, University of Essex, Colchester, UK.
     Leonard C Schalkwyk

Authors

  1. Eilis Hannon
  2. Helen Spiers
  3. Joana Viana
  4. Ruth Pidsley
  5. Joe Burrage
  6. Therese M Murphy
  7. Claire Troakes
  8. Gustavo Turecki
  9. Michael C O'Donovan
  10. Leonard C Schalkwyk
  11. Nicholas J Bray
  12. Jonathan Mill

Contributions

J.M. and N.J.B. conceived and supervised the study and obtained funding. E.H. undertook primary data analysis and bioinformatics. L.C.S. provided analytical support. H.S., J.V., R.P., T.M.M. and J.B. performed laboratory work. C.T. and G.T. provided samples for analysis. M.C.O'D. provided support for GWAS enrichment analyses. E.H., N.J.B. and J.M. drafted the manuscript. All of the authors read and approved the final submission.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Jonathan Mill.

Integrated supplementary information

Supplementary figures

  1. The distribution of effect sizes across all Bonferroni significant fetal brain mQTLs.
  2. Frequency distribution of mQTL SNPs and associated DNA methylation sites.
  3. The statistical significance of association between genotype and DNA methylation is related to the distance between the Illumina 450K array probe and mQTL SNP.
  4. There is a highly-significant correlation of individual mQTL effects between fetal brain and each of the individual adult brain regions.
  5. The correlation of mQTL effect sizes (% DNA methylation change per allele) between fetal brain and adult brain is stronger for replicating variants (left) than non-replicating variants (right).
  6. Fetal brain mQTLs that do not replicate in adult brain are characterized by significantly smaller effect sizes across all brain regions, including the initial fetal brain discovery sample (P = 3.18×10−141).
  7. SNPs associated with DNA methylation are more significantly associated with gene expression than non-mQTL variants.
  8. The overlap between independent fetal brain mQTL signals with the sentinelized subsignals from a brain eQTL dataset.
  9. Fetal brain mQTLs are significantly enriched for schizophrenia genetic risk variants.
  10. Boxplot of mQTL effects observed for rs2535627, an index SNP from the recent schizophrenia GWAS.
  11. Boxplot of mQTL effect observed for rs4648845, an index SNP from a recent schizophrenia GWAS.
  12. mQTLs identified using imputed genetic data reflected the non-imputed dataset in terms of genomic distribution and observed effect sizes.
  13. Effect sizes at fetal brain mQTLs identified in this study are highly correlated with those identified in an independent study of cortical mQTLs across development.

Supplementary information

PDF files

  1. Supplementary Text and Figures: Supplementary Figures 1–13
  2. Supplementary Methods Checklist

Excel files

  1. Supplementary Table 1: Summary of demographic data for each of the four brain mQTL datasets generated in this study.
     Abbreviations: PFC = prefrontal cortex, STR = striatum, CER = cerebellum.
  2. Supplementary Table 2: Summary of Bonferroni-significant fetal brain mQTLs.
  3. Supplementary Table 3: Annotated list of all Bonferroni significant fetal brain mQTLs.
     For each significant (P < 3.69×10−13) mQTL, associated DNA methylation sites are annotated with genomic location, Illumina gene annotation, and ENCODE transcription factor binding site or DNaseI hypersensitivity site (DHS) overlap. mQTL SNPs are annotated with genomic location. Regression coefficients and P-values are provided for each mQTL across the fetal brain and three adult brain region datasets. A list of all significant fetal brain mQTLs generated using imputed genotypes is available for download from: http://epigenetics.essex.ac.uk/mQTL/. Abbreviations: PFC = prefrontal cortex, STR = striatum, CER = cerebellum, nt = not tested in that dataset, ns = not significant (P > 0.0001) in that dataset.
  4. Supplementary Table 4: Many fetal brain trans-mQTLs are observed in non-neural tissues.
     For each DNA methylation probe identified as being influenced by a trans-mQTL in fetal brain, the strongest corresponding trans-mQTL effect in two published genome-wide mQTL analyses in pancreatic islets [Olsson, A.H. et al. PLoS Genet 10, e1004735 (2014)] and lymphocytes [Lemire, M. et al. Nat Commun 6, 6326 (2015)] is presented.
  5. Supplementary Table 5: Overlap of fetal brain mQTLs in non-neural tissues.
     The set of DNA methylation probes associated with mQTL SNPs in fetal brain was compared to the set of DNA methylation probes associated with mQTL SNPs in two published genome-wide mQTL analyses undertaken in pancreatic islets [Olsson, A.H. et al. PLoS Genet 10, e1004735 (2014)] and lymphocytes [Lemire, M. et al. Nat Commun 6, 6326 (2015)].
  6. Supplementary Table 6: Enrichment of human fetal brain mQTLs in ChIP-seq peaks for regulatory histone modifications in fetal brain.
     We tested for enrichment of genetically-mediated DNA methylation sites in fetal brain histone modification ChIP-seq peaks identified by the Roadmap Epigenomics Project (http://www.roadmapepigenomics.org/).
  7. Supplementary Table 7: Enrichment of human fetal brain mQTLs in DNase hypersensitivity sites (DHSs).
     We tested for enrichment of genetically-mediated DNA methylation sites in DHSs identified by the ENCODE project.
  8. Supplementary Table 8: Enrichment of human fetal brain mQTLs in transcription factor binding sites (TFBSs).
     We tested for enrichment of genetically-mediated DNA methylation sites in TFBSs identified by the ENCODE project.
  9. Supplementary Table 9: Comparison of mQTLs across fetal and adult brain regions.
     Summary statistics for Bonferroni significant fetal brain mQTLs are presented at a range of P-value replication thresholds. Abbreviations: PFC = prefrontal cortex, STR = striatum, CER = cerebellum.
  10. Supplementary Table 10: Fetal brain mQTLs with significant heterogeneity between fetal and adult datasets.
     Abbreviations: PFC = prefrontal cortex, STR = striatum, CER = cerebellum.
  11. Supplementary Table 11: Fetal brain mQTLs characterized by opposite direction of effects in at least one adult brain region.
     Abbreviations: PFC = prefrontal cortex, STR = striatum, CER = cerebellum.
  12. Supplementary Table 12: Summary of overlap between fetal brain mQTLs and brain expression QTLs (eQTLs).
  13. Supplementary Table 13: Genetic variants associated with DNA methylation and gene expression in the human brain.
     All pairs of mQTLs and eQTLs for the intersecting set of genetic variants where the same SNP was found to influence DNA methylation and gene expression in cis.
  14. Supplementary Table 14: Enrichment of brain eQTLs in human fetal brain mQTLs.
  15. Supplementary Table 15: GWAS variants identified as human fetal brain mQTLs.
     All genome-wide significant variants (P < 5 × 10−8) identified for Alzheimer's disease and BMI prior to LD clumping that may mediate DNA methylation are included in this table along with their Bonferroni significant mQTLs. There were no genome-wide significant type 2 diabetes variants that were Bonferroni significant mQTLs. A comparable table identifying the overlap with mQTLs generated using imputed genotypes is available for download from: http://epigenetics.essex.ac.uk/mQTL/. Abbreviations: PFC = prefrontal cortex, STR = striatum, CER = cerebellum.
  16. Supplementary Table 16: Robustly-associated schizophrenia GWAS variants characterized by human fetal brain mQTLs.
     The set of 'likely causal variants' for schizophrenia was defined as all index variants representing the 125 independently associated autosomal loci in the latest PGC schizophrenia GWAS3. Given that the causal variant at these loci is yet to be established, this list was extended to include additional variants in strong LD (r2 > 0.8 based on 1000 Genomes European populations) with these 125 autosomal index variants. For this list of 'likely causal variants', all Bonferroni significant fetal brain mQTLs were identified. A comparable table identifying the overlap with mQTLs generated using imputed genotypes is available for download from: http://epigenetics.iop.kcl.ac.uk/mQTL/. Abbreviations: PFC = prefrontal cortex, STR = striatum, CER = cerebellum.
  17. Supplementary Table 17: Genomic loci with evidence for association with both schizophrenia and DNA methylation.
     Co-localization analysis was performed using the publicly available summary statistics from the PGC schizophrenia GWAS and imputed mQTLs generated in this study. This table contains the 306 pairs where the association for schizophrenia and a DNA methylation site overlap the same genomic region. This includes instances where either the same causal variant is associated with both phenotypes or there are separate signals for each phenotype. The posterior probabilities for each hypothesis are reported (see Online Methods for details).

Zip files

  1. Supplementary Analysis Scripts: Rscripts
     R code for analyses and figures.

About this article

DOI

https://doi.org/10.1038/nn.4182

Further reading