Susceptible genes and disease mechanisms identified in frontotemporal dementia and frontotemporal dementia with Amyotrophic Lateral Sclerosis by DNA-methylation and GWAS

Frontotemporal dementia (FTD) is a neurodegenerative disorder predominantly affecting the frontal and temporal lobes. Genome-wide association studies (GWAS) on FTD identified only a few risk loci. One possible explanation is that FTD is clinically, pathologically, and genetically heterogeneous. An important open question is to what extent epigenetic factors contribute to FTD and whether these factors vary between FTD clinical subgroups. We compared the DNA-methylation levels of FTD cases (n = 128), and of FTD cases with Amyotrophic Lateral Sclerosis (FTD-ALS; n = 7), to those of unaffected controls (n = 193), which resulted in 14 and 224 candidate genes, respectively. Cluster analysis revealed significant class separation of FTD-ALS from controls. We could further specify genes with increased susceptibility for abnormal gene-transcript behavior by jointly analyzing DNA-methylation levels with the presence of mutations in a GWAS FTD-cohort. For FTD-ALS, this resulted in 9 potential candidate genes, whereas for FTD we detected 1 candidate gene (ELP2). Independent validation-sets confirmed the genes DLG1, METTL7A, KIAA1147, IGHMBP2, PCNX, UBTD2, WDR35, and ELP2/SLC39A6 among others. We could furthermore demonstrate that genes harboring mutations and/or displaying differential DNA-methylation are involved in common pathways, and may therefore be critical for neurodegeneration in both FTD and FTD-ALS.

Figure 1. Differentially methylated genes for FTD and FTD-ALS. Violin plots depicting the significantly differentially cytosine DNA-methylated genes for (A) all 128 FTD cases, and (B) the FTD-ALS cases, in comparison to control cases. Genes are sorted based on T-statistics. Genes with a relative value lower than 0 are DNA hypomethylated and genes with a relative value higher than 0 are DNA hypermethylated. The green, purple and yellow lines depict the average value for FTD-ALS, FTD and control cases, respectively. (C) Principal Component Analysis on the differentially DNA-methylated probes from FTD-ALS.
Scientific Reports | 7: 8899 | DOI: 10.1038/s41598-017-09320-z
Comparison of the FTD-ALS cases (n = 7) to the controls resulted in 200 significant differential cytosine DNA-methylation probes (P BH < 0.05, Fig. S1 panel A-B), annotated for 224 unique genes (Fig. 1B). Note that none of the 14 genes identified for FTD were found among the 224 genes for FTD-ALS. Moreover, 140 probes (mapped to 163 genes) showed relatively lower methylation levels compared to controls, with an average value below 0, indicating a DNA hypomethylation state. The remaining 60 probes (mapped to 63 genes) showed, with an average value above 0, relatively higher methylation levels than controls, indicating a DNA hypermethylation state (Figs 1B, S2B). The distinctness of the DNA-methylation profiles was further underlined by Principal Component Analysis, combined with the Davies-Bouldin index to determine the number of clusters, which resulted in an exclusive and significant grouping of the FTD-ALS cases (P = 5.23 × 10 −11 , Fig. 1C).
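The Davies-Bouldin index mentioned above scores a clustering by comparing within-cluster scatter to between-centroid distance, with lower values indicating better separated clusters. The following is a minimal sketch in plain Python, not the authors' implementation; the function names and the toy coordinates (standing in for samples projected on principal components) are assumptions made for illustration.

```python
from math import dist


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(axis) / n for axis in zip(*points))


def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a list of points).

    Lower values mean more compact, better separated clusters.
    """
    centroids = [centroid(points) for points in clusters]
    # Scatter: mean distance of the points in a cluster to its centroid.
    scatter = [sum(dist(p, c) for p in points) / len(points)
               for points, c in zip(clusters, centroids)]
    k = len(clusters)
    worst_ratios = []
    for i in range(k):
        ratios = [(scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
                  for j in range(k) if j != i]
        worst_ratios.append(max(ratios))
    return sum(worst_ratios) / k


# Hypothetical 2-D coordinates, e.g. samples on the first two components.
well_separated = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
poorly_separated = [[(0.0, 0.0), (0.0, 2.0)], [(2.0, 0.0), (2.0, 2.0)]]
db_good = davies_bouldin(well_separated)
db_poor = davies_bouldin(poorly_separated)
print(db_good, db_poor)
```

To choose the number of clusters, one would compute the index for each candidate clustering and prefer the one with the lowest value.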
Genes that are specific for brain tissue show significant overlap with the associated FTD-ALS DNA-methylated genes. To test whether the detected differentially DNA-methylated genes for FTD (n = 14 genes) and FTD-ALS (n = 224 genes) are associated with expression in brain, we utilized RNA sequencing data, containing expression levels of 16,115 genes, from the GTEx consortium on 1,641 samples and over 25 unique tissue types [25][26][27] (more details can be found in the Methods section: Tissue-type association). To determine tissue enrichment, we marked the genes that are specific for each tissue type by comparing tissue-specific versus remaining samples, under the restriction that expression levels were significantly different with P BH < 0.05 (corrected for 16,115 tests) using the Student's t-test and with an absolute fold-difference of > 1.5 (Fig. 2A). Genes associated with FTD and FTD-ALS were subsequently tested for significant enrichment for any of the tissue-specific gene sets using the hypergeometric test.
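The selection of tissue-specific genes described above combines a multiple-testing correction with a fold-difference cut-off. The sketch below illustrates that combination in plain Python, assuming the per-gene t-test P-values and signed fold-differences have already been computed; the function names and toy values are hypothetical and not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # smallest P first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_tissue_specific(genes, pvalues, fold_differences,
                           alpha=0.05, min_abs_fold=1.5):
    """Keep genes with adjusted P < alpha and |fold-difference| > min_abs_fold."""
    adjusted = benjamini_hochberg(pvalues)
    return [gene
            for gene, p_adj, fold in zip(genes, adjusted, fold_differences)
            if p_adj < alpha and abs(fold) > min_abs_fold]


# Hypothetical toy input for one tissue-versus-rest comparison.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
pvalues = [0.01, 0.04, 0.03, 0.005]
fold_differences = [2.0, -1.8, 1.2, -3.0]
adjusted = benjamini_hochberg(pvalues)
selected = select_tissue_specific(genes, pvalues, fold_differences)
print(selected)
```

In this toy input GENE_C passes the significance cut-off but is dropped because its fold-difference is below 1.5, which shows why both criteria are applied together.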
No significant tissue-specific enrichment was seen for the 14 FTD-associated differentially DNA-methylated genes after multiple test correction. For FTD-ALS cases, we did detect three significantly associated tissue types out of the 25 tissues tested, namely Blood (P BH = 0.01), Brain (P BH = 0.02), and Liver (P BH = 0.04) (Fig. S1A), based on the 224 differentially DNA-methylated genes. More specifically, across the brain regions we detected significant overrepresentation for Parietal Neocortex (P BH = 9.41 × 10 −4 ) and Primary Motor-Sensory Cortex (P BH = 0.031). This indicates that methylation changes detected in peripheral blood of FTD-ALS cases could also be reflective of changes in other tissues, including the brain.

DNA-methylation profiles of FTD-ALS patients reflect biological processes essential in Prefrontal, Primary Motor-Sensory Cortex, and Parietal Neocortex. Next, we addressed the question whether the hyper/hypo DNA-methylated genes in FTD, and separately in FTD-ALS, are significantly overrepresented among genes that are specific for any of the brain regions (instead of tissue types as demonstrated in the previous section). We tested for significant overrepresentation based on RNA-sequencing (525 samples across 26 brain regions), DNA-methylation (177 samples over 17 brain regions), and pre-defined gene sets (n = 22) from BrainSpan 28 by using the procedure as outlined in Fig. 2A.
For the 224 associated genes in FTD-ALS, we detected significant overrepresentation with Parietal Neocortex (P BH = 9.41 × 10 −4 ) and Primary Motor-Sensory Cortex (P BH = 0.031, Fig. S3B) using the RNA-sequencing data, after correcting for the 26 performed tests. Based on the DNA-methylation profiles, we detected significant overrepresentation in 14 specific brain regions (Fig. 2B), among which Primary Visual Cortex (P BH = 2.05 × 10 −7 ), Primary Motor Cortex (P BH = 0.0441), Dorsolateral Prefrontal Cortex (P BH = 0.0351), and Inferolateral Temporal Cortex (P BH = 0.0051), after correcting for the 17 performed tests. Finally, for the pre-defined gene sets we detected borderline significance for the Medial Prefrontal Cortex tissue (P BH = 0.05). In general, we observed that the majority of DNA-methylated genes in FTD-ALS (176/224, Fig. 2C) overlaps with the genes that are significantly differentially expressed in any of the 14 brain regions. For FTD we detected no significant overrepresentation of the 14 genes among any of the brain specific regions (P BH < 0.05).
Figure 2. (A) The hypergeometric test is used to compute the P-value for each tissue based on the following parameters: total number of genes (M), number of tissue specific genes (K), number of significant differential methylated genes (N), and the overlap of significant differential methylated genes and the genes in the tissue specific gene set (x). The final P-value (P*) is corrected for multiple testing using Benjamini and Hochberg, and used for tissue selection with P BH < 0.05. (B) Enriched tissues in FTD-ALS sorted in -log10(P BH ). (C) Genes are colored with the brain tissue specific color if overlap is seen with any of the significantly differential DNA-methylated genes in FTD-ALS.
We hypothesized that genes that contain potential risk SNPs and have a differential DNA-methylation profile may have increased susceptibility for differences in gene-transcript levels, and may therefore be implicated in disease development. To test this hypothesis, we utilized GWAS summary statistics for FTD, and separately for FTD-ALS 4 , and extracted all SNPs with unadjusted P < 0.05. Note that the corrected genome-wide P-value threshold yields only a few genes, but we hypothesized that multiple, relatively small effects can have an impact at the functional level.
For FTD-ALS this yielded 5,535 SNPs, annotated to 4,147 unique genes using ANNOVAR 29 . First, we overlaid the 4,147 genes with the 224 genes as per the FTD-ALS DNA-methylation markers and detected a significant overlap based on the hypergeometric test (53 genes, P = 0.0005, Table S2, using as background the total number of unique HG19 genes). This indicates that in FTD-ALS, non-random genes were detected with both risk SNPs and differences in DNA-methylation levels. To further refine the potential candidate genes, we removed intronic, intergenic and synonymous SNPs and incorporated the CADD score to determine the deleteriousness. This filtering step yielded 26 candidate genes for 30 SNPs (with CADD-score > 15) that are nonsynonymous or stopgain in exonic or splicing regions (Table 1). The 26 genes could be categorized into genes with DNA hypermethylation (n = 8) and hypomethylation (n = 18) status. None of the 30 SNPs occurred exactly in a DNA-methylation probe-region.
Table 1. Candidate list of genes that display abnormal DNA-methylation levels, and harbor risk-SNPs. Detection of 26 candidate genes for 30 SNPs for FTD-ALS. Genes are grouped in DNA hypomethylated (DMP T-statistics < 0) and hypermethylated genes (DMP T-statistics > 0) followed by P-value significance of GWAS. Chr: Chromosome. GWAS P: P-value significance for the phenotype association. GWAS CADD score: quantifies the deleteriousness of the SNP in the gene (the higher the worse). GWAS min(P) for gene: Minimum P-value significance for the phenotype association without excluding intronic, and intergenic SNPs. DMP P BY : P-value for the DNA-methylation difference between FTD-ALS vs Control cases after multiple test correction using Benjamini and Hochberg. DMP T-stat: T-statistics. Network Gene-degree: the number of edges the gene contains in the co-expression network.
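The SNP refinement step described above (restricting to differentially methylated genes, dropping intronic, intergenic and synonymous variants, and requiring CADD > 15) can be sketched as a simple filter. This is an illustrative sketch only: the record fields (`gene`, `region`, `effect`, `cadd`), the function name and all example records are hypothetical and do not reproduce ANNOVAR's actual output format or any SNP from the study.

```python
EXCLUDED_REGIONS = {"intronic", "intergenic"}


def refine_candidates(snps, methylated_genes, min_cadd=15.0):
    """Keep SNPs in differentially methylated genes that survive the filters."""
    kept = []
    for snp in snps:
        if snp["gene"] not in methylated_genes:
            continue  # gene has no differential DNA-methylation
        if snp["region"] in EXCLUDED_REGIONS:
            continue  # intronic or intergenic SNP
        if snp["effect"] == "synonymous":
            continue  # no amino-acid change
        if snp["cadd"] > min_cadd:
            kept.append(snp)  # considered deleterious
    return kept


# Hypothetical annotated SNP records.
snps = [
    {"id": "snp_1", "gene": "GENE_A", "region": "exonic",
     "effect": "nonsynonymous", "cadd": 22.8},
    {"id": "snp_2", "gene": "GENE_A", "region": "intronic",
     "effect": None, "cadd": 18.0},
    {"id": "snp_3", "gene": "GENE_B", "region": "exonic",
     "effect": "synonymous", "cadd": 16.0},
    {"id": "snp_4", "gene": "GENE_C", "region": "exonic",
     "effect": "stopgain", "cadd": 9.5},
    {"id": "snp_5", "gene": "GENE_D", "region": "exonic",
     "effect": "stopgain", "cadd": 30.0},
]
methylated_genes = {"GENE_A", "GENE_B", "GENE_C"}
candidates = refine_candidates(snps, methylated_genes)
print([snp["id"] for snp in candidates])
```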
The most significant SNP association was detected in gene DLG1; the SNP is located in an exon (rs74674649, P = 6.0 × 10 −4 ), and the promoter region of the gene also shows significant hypomethylation (P = 0.0288). This gene is described as being exclusively located in the postsynaptic density of neurons, and is crucially involved in anchoring postsynaptic membrane proteins.

A similar approach was performed for all FTD cases, but here we extracted SNPs with unadjusted P < 0.05 using the summary statistics of the FTD-GWAS (instead of FTD-ALS). Positional mapping of SNPs using ANNOVAR revealed 3,662 genes. We detected 4 overlapping genes (P = 0.0553, Table S3) between the 14 DMP genes and 3,662 GWAS genes. One of the four genes, ELP2, contained a SNP (rs16967474, P = 0.0322) that was located in an exon, was nonsynonymous, and had a CADD-score of 25.3. Interestingly, the ELP2 gene was recently found to be implicated in neurodevelopmental disabilities 30 . To summarize, we here isolated potentially functionally relevant genes for FTD, particularly for the FTD-ALS subtype, based on the combination of both genetic and epigenetic profiles.

Biological processes are affected by both genetic and epigenetic aberrations.
To assess whether biological mechanisms are affected in FTD-ALS, either due to differences in DNA-methylation levels (n = 224 genes) or due to genetic architecture (n = 4,147 genes), we performed a pathway analysis on the 224 genes, and separately on the 4,147 genes. We next analyzed the overlap of pathways. Note that we did not detect significant enrichment of pathways for the 14 unique markers in FTD by means of the hypergeometric test.
Pathway analysis was performed using gene sets with a described function in brain and/or neurological development, derived from the molecular signature database (MsigDB v5.1 31 , see Methods section for more details, such as the number of pathways that were tested). The 224 DMP genes for FTD-ALS revealed three significantly enriched pathways (P BH < 0.05, Fig. 3A), namely: Reactome Neuronal System (P BH = 0.005), Lastowska Neuroblastoma Copy Number DN (P BH = 0.0256), and Meissner brain HCP with H3K4me3/H3K27me3 (P BH = 0.0182). Separately, we performed a pathway analysis for the 4,147 unique genes derived from the FTD-ALS GWAS, which resulted in 44 enriched pathways (P BH < 0.05, Fig. 3A). Two of the three pathways overlapped, i.e., Meissner brain HCP with H3K4me3/H3K27me3 (P BH = 6.82 × 10 −11 ), and Lastowska Neuroblastoma Copy Number DN (P BH = 8.45 × 10 −4 ). The histone modification H3K4me3/H3K27me3 gene set was previously implicated in various neurological phenotypes and psychiatric disorders 32 , whereas the neuroblastoma pathway points to genes with copy-number losses in primary neuroblastoma tumors; neuroblastoma cell lines have also been used as a model system for FTD 33,34 .
Interestingly, the two common pathways showed different overlapping genes (Fig. 3B), indicating that different genes are implicated from the genetic and epigenetic perspective but are located in the same pathway. As an example, the histone modification H3K4me3 gene set contains 1070 genes with only a joint overlap of six genes between the genetic and epigenetic markers (Fig. 3C). Similarly, the Lastowska Neuroblastoma Copy Number DN gene set contains 801 genes with a joint overlap of two genes (Fig. 3D).
Figure 4. Co-expression network for FTD-ALS group. Continuous gene expression data from GTEx (n = 313 samples over 13 brain regions) was used to build a co-expression network using the genes that are marked being significantly differential DNA-methylated in FTD-ALS group, and with significant pairwise correlation (|r| > 0.6 and P < 0.001). Node color depicts DNA-methylation status of the gene; red color depicts DNA hypomethylation (T-statistics < 0), and blue color depicts DNA hypermethylation (T-statistics > 0). Node size and text label depicts CADD score for the associated SNP (larger node size depicts a relatively more deleterious variant, and genes without a SNP have equal small node size). Yellow colored text labels depict a SNP that associated with the gene from GWAS FTD-ALS, whereas a black color depicts no SNP. Edges with positive correlation are indicated in red, whereas negative correlations are indicated in blue. Thickness of edges is based on the absolute correlation measure, which varies between 0.6 and 1.

DNA-methylated genes involved in FTD-ALS are highly co-expressed in normal brain function.
To analyze the mediating role of DNA-methylation on the signaling cascade in FTD-ALS, we constructed a co-expression network (pairwise Spearman correlations) between the continuous mRNA expression levels using data from the GTEx consortium (see methods section for more details). The co-expression network contained 150 genes (out of the 224 genes) with minimum correlation of |r| > 0.6 and significant pairwise interactions P < 0.001 (Fig. 4).
In the co-expression network topology, we overlaid: (1) the DNA-methylation status of FTD-ALS cases (node color); (2) the detected SNPs from GWAS FTD-ALS cases (marked with a yellow colored gene label); and (3) the associated CADD-score (node size). To get a notion of the functional importance of a gene, we used the gene-degree in the co-expression network (number of edges the gene contains), as higher-level regulators may have more co-expressed genes. We used gene-degree in the co-expression network to further prioritize the candidate gene-list (Table 1, Fig. 4). We detected that Immunoglobulin Mu Binding Protein 2 (IGHMBP2) was one of the genes with the highest degree (31) that also contained a deleterious stop-gain mutation (CADD score: 22.8). Interestingly, this gene is associated with the disease distal hereditary motor neuropathy type 6, where motor neurons degenerate selectively in the anterior horn of the spinal cord. The full list of gene-degrees is provided in Table S4.

DNA-methylation levels for GRN, MAPT, and C9orf72. Besides analyzing the methylation profiles from a genome-wide perspective, we also analyzed separately the probes associated with the three known genetic markers of FTD, i.e., GRN, MAPT, and C9orf72.
The promoter of GRN has previously been demonstrated to be hypermethylated 11 . In our data set, 12 GRN probes were available, of which one probe (cg17101358, located at 5′UTR/1stExon) showed borderline significant differences in DNA-methylation levels (P BH = 0.059) in FTD compared to the control group (Student's t-test). No significant differences in DNA-methylation levels were detected for the FTD-ALS group. The gene MAPT contained one probe, without significant differences in DNA-methylation levels for either FTD or FTD-ALS cases, which is in line with current literature 35,36 . Analysis of the four C9orf72 probes (5′UTR, TSS200, and two in TSS1500) also did not result in significant differences in DNA-methylation for FTD or FTD-ALS cases. Note that C9orf72 has previously been identified with DNA hypermethylation in the promoter region when performing a single-gene promoter analysis 37 .
Validation by meta-analysis of gene transcript levels. We sought replication to examine the validity of the detected genes that reached genome-wide significance in the primary analyses. Since there are no independent DNA-methylation profiles for FTD or FTD-ALS, we used gene transcript levels of samples with FTD, and separately Amyotrophic Lateral Sclerosis cases (ALS), which is similar to ALS in FTD-ALS. The mediating role of DNA-methylation on the transcript level is well established, and therefore we hypothesized that similar affected genes should be evident from our study. We included four independent studies from Gene Expression Omnibus (GEO) that we considered the most suitable for validation. We analyzed these data sets in a meta-analysis (see materials and methods), where we ranked the DNA-methylated genes, implicated in FTD or FTD-ALS, based on the overlap with the significantly differential expressed genes across the seven validation data sets.
To determine the significantly differentially expressed genes across the validation data sets, we performed an unbiased test by comparing the gene expression levels of cases versus controls using Limma. Note that we corrected for multiple testing based on the number of probes that were present per study, as described in the Materials and methods section. All validation data sets, except one (#4), resulted in significantly differentially expressed genes (P BH < 0.05, Tables S5 and S6).
For FTD-ALS, 60 out of 224 genes could be validated in total (Fig. 5, Table S5), of which 5 genes were seen across two validation sets: CCND2, PCNX, PTP4A2, METTL7A, and PALLD. To further specify potential candidate genes that are implicated in FTD-ALS, we only included genes with aberrant DNA-methylation and deleterious SNPs, and detected 9 genes (Table 1, and Fig. 5). For FTD we detected one gene, namely ELP2/SLC39A6 (Table S1). Besides the validation of single genes, we also emphasized the relevance of our DNA-methylated gene set of FTD-ALS by the detection of significant overrepresentation of genes across two validation data sets (#1, and #2, Fisher exact test, P < 0.05, Table S6). No significant results were seen for FTD.
Figure 5. Validated genes by independent gene expression data sets. Significantly differentially expressed genes across six independent studies overlaid with the aberrant DNA-methylated genes in FTD-ALS. Grey squares depict overlap of genes between FTD-ALS markers and one or multiple validation data sets. Genes that could not be validated were removed from the plot. The SNP row depicts genes that were also discovered with a deleterious SNP.

Discussion
In this study, we investigated the DNA-methylation profiles (DMPs) of cases with FTD to detect genes affected by epigenetic biological mechanisms that may play a role in neurodegeneration. The first aim of this study was to explore the separation of FTD clinical subtypes using the DNA-methylation profiles, for which we could demonstrate a clear separation of the FTD-ALS subtype. The second aim was to detect genetic variants and/or epigenetic changes that show associations with FTD and/or FTD-ALS. Ideally, the candidate genes should be validated with bisulfite pyrosequencing or using an independent DNA-methylation cohort of FTD cases, but such a data set does not exist in the public domain. We therefore aimed to validate our results by using multiple independent gene expression data sets. The validated genes thus have increased susceptibility for abnormal gene-transcript behavior, harbor risk-SNPs, and display abnormal DNA-methylation levels, and many are annotated with a function in brain and/or neurodevelopment.
Depending on the follow-up steps, the gene-list can be further narrowed by specific ordering, e.g., based on SNP association, DNA-methylation status, degree of co-expression, or even by its role in specific pathways. As an example, synapse-associated gene DLG1 contains the most significant SNP association followed by KIAA1147 which is suggested to have a role in neurogenesis and neuronal recovery and/or restructuring in the hippocampus following transient cerebral ischemia 38 . For the validated genes with hypomethylation status, we identified gene KIAA1147, and gene IGHMBP2 among others. The latter gene is described with distal hereditary motor neuronopathy type 6, which selectively degenerates motor neurons in the anterior horn of the spinal cord, and reported with a role in development of adult human brain, and motor neurons 39 . Prioritization based on the co-expression networks placed gene IGHMBP2, and PCNX as the top genes. Notably, genes without a deleterious SNP can also be of interest and ordered based on degree of co-expression. An example is gene GPR176 (degree = 32) which is involved in responses to hormones, growth factors, and neurotransmitters 40 , whereas gene ATXN7L1 (degree = 32) showed functional relation to brain based on the Human Integrated Protein Expression Database (HIPED). Another gene of interest with DNA hypermethylation status is COL15A1, which is previously reported with downregulated expression levels in iPSC-derived ALS motor neurons 41,42 . Our results are in line with these findings as the hypermethylation in the promoter region of COL15A1 can be indicative for the down-regulation of transcript levels.
We showed the possibility of detecting novel SNPs (and genes) that do not reach genome-wide statistical significance using conventional GWAS approaches but may confer an increase in risk of disease development. A crucial step in our approach was to relax the traditional GWAS P-value threshold (which is P < 5 × 10 −8 ), which we could do because the P-value describes the association with the (SNP) genotype, and not the gene function. Thus, a relatively small phenotypic effect for a SNP can still have a large effect at the gene level, particularly through the presence of deleterious variant(s) in the coding region (as shown in the current work). The effect of such variant(s) might be exacerbated by the presence of aberrant overexpression due to DNA hypomethylation. Conversely, the expression of genes required for normal neurological function may be lacking or silenced when transcription is suppressed by DNA hypermethylation. Therefore, we hypothesized that by employing a double-hit model, potential novel targets for brain/neurological functions can be detected. A disadvantage of relaxing the P-value threshold is that we may have detected false positive associations with the phenotype. To mitigate this, we took several steps: we removed genes that are annotated as being spurious 43 , we focused only on the deleterious SNPs that are present in coding regions, and we incorporated the DNA-methylation profiles of the FTD cases. Altogether, we could demonstrate a significant number of genes that harbor both risk SNPs and significant differences in DNA-methylation levels. This indicates non-random behavior of genes that are targeted in both FTD and FTD-ALS.
For our third aim, we examined whether genetic and epigenetic changes for FTD and/or FTD-ALS may both be present in specific biological processes. One of the pathways that we detected in FTD-ALS with both genetic and epigenetic changes involves the histone modifications H3K4me3 and H3K27me3, which were previously described to be associated with neurological functions 32 , and to be involved in social exclusion 44 by examining liver tissue in mice. Thus overall, evidence is pointing to an association between histone modifications and neurological function. In that perspective, we also demonstrate that this particular pathway is affected in cases with FTD-ALS for both the genetic (SNPs) and epigenetic profiles (DMP). The histone modification changes are of interest because of their regulation by DNA methyltransferases, such as DNMT3A/B 45,46 , and consequently for the use of DNMT inhibitor (DNMTi) therapies. DNMT inhibitors include azacitidine and decitabine, which are FDA approved for use in leukemia 47 . For neurodegenerative diseases, this may also provide a handle for therapy because cytosine methylation can be targeted by DNMTi to reverse the methylation status. A potential candidate gene that we detected is, for example, COL15A1 48 , but this would first require independent replication/validation.
For FTD, single-gene DNA-methylation promoter analysis was performed previously for MAPT, GRN, and C9orf72. For MAPT, no significant differences in DNA-methylation levels were previously seen 36 , whereas both GRN and C9orf72 were shown to contain DNA hypermethylation in the promoter region 11,37 . We expected to see similar results in our analysis, but genome-wide DNA-methylation analysis revealed no significance for the probes associated with these three genes. A reason for such discrepancy could be that DNA-methylation occurs in specific promoter regions that do not overlap with the Infinium HumanMethylation450 BeadChip probes, which is true for MAPT and GRN (Table S7).
Our analyses are based on the assumption that the use of DMPs measured in blood is a proxy for DMPs in brain. We carefully examined the proxy, and demonstrate that differentially expressed genes in blood, liver, and brain tissue significantly overlapped with the differentially expressed genes that are also relevant to FTD-ALS. Although for neurodegenerative diseases brain would be the preferential tissue in which to investigate DNA-methylation profiles, the use of peripheral blood might to some extent overcome this issue, as we showed that a significant number of genes with differential DMPs in the blood are also important for molecular processes in brain. Nonetheless, the use of peripheral blood to analyze DNA-methylation profiles as a model for brain tissue requires caution. Besides blood, other tissues, such as liver, have also been shown to be informative for examining neurological function in mice 44 , which is in line with our findings.
The DMP data used in this study originates from Li, Y. et al. 19 , but we focused specifically on the FTD cases (and not PSP), for which we integratively analyzed the epigenetic and genetic status of genes. In addition, we combined the two batches of samples after batch-correction normalization. This allowed unsupervised analysis using all samples together, and the increased number of samples provided increased statistical power to detect differentially methylated genes. Overall, the differentially methylated genes from our analysis are in line with those previously detected using the batches separately and in the meta-analysis 19 (P overlap gene set-1: 0.0298, P overlap gene set-2: 0.0726, and P overlap combined meta-analysis: 0.0073, Fig. S5). It is interesting to note that we detected in total 224 differentially methylated genes for the FTD-ALS group, whereas the FTD cases showed only 14 genes, compared to the controls. To accommodate co-variates responsible for changes in methylation that are unrelated to FTD, we analyzed an additional control set of DNA-methylation profiles (GSE53045, Fig. S4A,B) as an alternative approach. We compared the DNA-methylation profiles of the controls in the FTD cohort versus the independent control group (non-smokers), which did not yield significant probes (Fig. S4C). In addition, we compared FTD vs. controls together with the non-smoker group, which resulted in 34 differentially DNA-methylated probes (Table S8, Fig. S4D). Using this extended control data set, we were able to rule out 2 genes that we initially found to be differentially DNA-methylated. Note that we already removed these two genes in our final results as the genes were not supported by our incorporated data sources.
The joint analysis and integration of multiple omic data sets is key to further analyze complex neurodegenerative diseases such as FTD. Although our results are based on unpaired samples, by combining genetic and epigenetic data we revealed novel candidate neurodegenerative genes and pathways. Further detailing the biological mechanisms involved in progressive degeneration of the temporal and frontal lobes of the brain requires a well characterized FTD cohort containing clinical, pathological and molecular information for which multi-omic data is obtained for the same samples. With the current work, we showed that both genetic and epigenetic data are useful to start unraveling neurodegenerative processes in FTD.

GWAS data set.
In this study, we used the GWAS summary statistics of 2,154 patients with FTD and separately 200 patients with FTD-ALS 4 . For further analyses, SNPs were retained with unadjusted P-value < 0.05 based on the complete FTD cohort and separately for the FTD-ALS cases. SNPs were annotated using ANNOVAR 29 , considered deleterious with CADD-score 49 > 15, and spurious genes were removed 43 .
The DNA-methylation data originate from Li et al. 19 . This cohort contains in total 128 FTD cases, of which 118 cases were described with C9orf72 negative status, and 10 cases with a repeat expansion. Seven cases were diagnosed with Amyotrophic Lateral Sclerosis (FTD-ALS), of which 3 cases were C9orf72 expansion carriers. There were no other reported pathogenic variants in any genes that were screened, including MAPT and GRN. Prior to making the comparison between FTD cases and controls, we normalized and processed the DNA-methylation beta values to remove technical biases and irrelevant probes (as described below), allowing us to combine the two batches of samples from the original study, instead of performing a meta-analysis by analyzing both batches separately 19 .

DNA-methylation
The DNA-methylation profiles contained 485,577 probes over 23,179 genes, which were annotated using official Infinium HumanMethylation450 BeadChip annotations. The software package ComBat 50 was used to remove batch effects, allowing us to combine all samples for further analysis instead of performing meta-analysis as previously described 19 . Furthermore, we removed probes that contained > 20% missing values based on all samples. We removed probes that are located on the X and Y chromosomes to avoid gender-related biases. Furthermore, we removed probes that contain SNPs with MAF > 0.1 (derived from dbSNP137), as SNPs that are common in the population can affect DNA-methylation levels and are more likely associated with, e.g., ethnicity 51 instead of disease phenotype. We also removed so-called control probes, and probes that are marked as being spurious 52 . Furthermore, we retained only probes located in close proximity to the annotated gene, i.e., TSS1500, TSS200, 5′UTR, 1st Exon, Body, or 3′UTR (based on original Infinium HumanMethylation450 BeadChip annotations). Probes that contained missing values were imputed using the K = 3 nearest neighbor approach. Beta values were zero-mean normalized, i.e., DNA hypermethylation is depicted with relative values above 0 and DNA hypomethylation is depicted with relative values below 0. The final set contained 214,170 probes over 20,956 genes. Currently, various pipelines and packages for Infinium HumanMethylation450 BeadChip processing are available that can be used for data pre-processing 53 .
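Two of the probe-level steps above, dropping probes with more than 20% missing values and zero-mean normalizing the beta values, can be sketched in a few lines. The sketch below is a simplification under stated assumptions: centering is done per probe across samples (the text does not specify the axis), the K = 3 nearest neighbor imputation is omitted so missing values are simply carried through as `None`, and the probe identifiers and values are hypothetical.

```python
def filter_and_center(beta, max_missing=0.20):
    """Drop probes with too many missing values and center the rest at zero.

    beta: dict mapping probe id -> list of beta values (None = missing).
    Returns a new dict; remaining missing values stay None (no imputation).
    """
    processed = {}
    for probe, values in beta.items():
        n_missing = sum(v is None for v in values)
        if n_missing / len(values) > max_missing:
            continue  # more than 20% missing: probe is removed
        observed = [v for v in values if v is not None]
        mean = sum(observed) / len(observed)
        # Values above 0 are relatively hypermethylated, below 0 hypomethylated.
        processed[probe] = [None if v is None else v - mean for v in values]
    return processed


# Hypothetical beta values for two probes across five samples.
beta = {
    "probe_a": [0.2, 0.4, 0.6, 0.8, None],   # 20% missing: kept
    "probe_b": [None, None, 0.5, 0.5, 0.5],  # 40% missing: removed
}
centered = filter_and_center(beta)
print(centered)
```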
Gene-expression validation data sets.
Tissue-type association. RNA sequencing data, with the expression levels of 16,115 genes, from 1,641 tissue samples over 25 unique tissue types was derived from the GTEx consortium 26,27 . To determine tissue enrichment with the DNA-methylated genes, we followed the procedure as outlined in Fig. 2A.
Step 1: for each of the 25 tissue types we tested for differential gene expression between samples within a tissue versus all other tissue samples.
Step 2: significantly differentially expressed genes for each tissue type were selected when the absolute fold-difference was > 1.5, and the P-value of the Student's t-test was ≤ 0.05 after correcting for multiple testing using the Benjamini and Hochberg method.
Step 3: the hypergeometric test was applied to determine the significance of the overlap between the tissue-type genes and the DNA-methylated genes in FTD(/ALS) based on the following parameters: total number of genes from the GTEx consortium (M = 16,115), number of tissue specific genes (K), number of significant differentially methylated genes (N), and the overlap of significant differentially methylated genes and the genes in the tissue specific gene set (x). Tissues with an adjusted P-value (P*) < 0.05 were selected.
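The hypergeometric test of Step 3 can be written directly from the four parameters M, K, N and x. The sketch below uses exact integer arithmetic from the Python standard library and assumes a one-sided, upper-tail test, P(X ≥ x), which is the usual choice for overrepresentation but is not stated explicitly in the text. The function name and the values of K and x in the usage example are hypothetical; only M = 16,115 and N = 224 come from the text.

```python
from math import comb


def hypergeom_enrichment_p(M, K, N, x):
    """Upper-tail hypergeometric P-value, P(X >= x).

    M: total number of genes (background)
    K: number of tissue-specific genes
    N: number of differentially methylated genes
    x: observed overlap between the two gene sets
    """
    upper = min(K, N)
    total = comb(M, N)
    tail = sum(comb(K, i) * comb(M - K, N - i)
               for i in range(max(x, 0), upper + 1))
    return tail / total


# Hypothetical example: 1,200 tissue-specific genes and an overlap of 30.
p_value = hypergeom_enrichment_p(M=16115, K=1200, N=224, x=30)
print(p_value)
```

The resulting P-values, one per tissue, would then be adjusted with the Benjamini and Hochberg method before applying the 0.05 threshold.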
The same procedure was applied to the BrainSpan 28 data to determine brain-tissue enrichment based on the RNA-sequencing data of 525 samples across 26 brain regions, DNA-methylation data of 177 samples over 17 brain regions, and by using 22 pre-defined gene sets. The pre-defined gene sets describe genes with known function across the various brain regions, and are derived from the official BrainSpan website. As a background, we used the total number of unique genes from BrainSpan (RNA-sequencing M = 18,107, and DNA-methylation M = 23,093).
Pathway/gene set analysis. We utilized the following pathways and gene sets from the molecular signature database (MsigDB v5.1) 31 : chemical and genetic perturbations (n = 3,396), Biocarta gene sets (n = 217), KEGG gene sets (n = 186), Canonical pathways (n = 1,330), Gene ontology Biological Processes (GO, n = 825), Gene ontology Cellular Components (GO, n = 233), Gene ontology Molecular Function (GO, n = 396), Oncogenic signatures (n = 189), and Immunologic signatures (n = 4,872). To lower the computational burden, we selected a priori for pathways/gene sets with brain or neurological function. Using the hypergeometric test, we calculated a P-value for the fraction of genes that overlapped with the annotated pathways/gene sets. A pathway was considered statistically significant when the P-value from the hypergeometric test was ≤ 0.05 after correcting for multiple testing using the Benjamini and Hochberg method. As a background, we used the M = 25,318 genes from UCSC HG19.
Co-expression network. The co-expression network is constructed based on pairwise Spearman correlations between the continuous mRNA expression levels using gene expression profiles of the GTEx consortium. For FTD-ALS we started out with the 224 genes and retained 150 genes that overlapped with genes from the GTEx consortium, and that showed a minimum absolute correlation of |r| > 0.6, and significant pairwise interactions P < 0.001. Edges with positive correlations are indicated in red (r > 0.6), whereas negative correlations are indicated in blue (r < −0.6). Thickness of edges is based on the absolute correlation measure, |r|, which varies between 0.6 and 1. The gene-degree is determined by the number of edges a gene contains in the co-expression network.
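The network construction above can be sketched as computing Spearman correlations for all gene pairs, keeping pairs with |r| > 0.6 as edges, and counting edges per gene to obtain the gene-degree. The sketch below is a plain-Python illustration and not the authors' code: it applies only the correlation threshold (the additional P < 0.001 filter is omitted), assumes no gene has constant expression, and uses hypothetical gene names and expression values.

```python
from itertools import combinations


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def coexpression_network(expression, min_abs_r=0.6):
    """Return (edges, degree) for gene pairs with |r| > min_abs_r."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = spearman(expression[g1], expression[g2])
        if abs(r) > min_abs_r:
            edges.append((g1, g2, r))  # r > 0: red edge, r < 0: blue edge
    degree = {}
    for g1, g2, _ in edges:
        degree[g1] = degree.get(g1, 0) + 1
        degree[g2] = degree.get(g2, 0) + 1
    return edges, degree


# Hypothetical expression levels of four genes across five samples.
expression = {
    "GENE_A": [1, 2, 3, 4, 5],
    "GENE_B": [2, 4, 5, 8, 10],
    "GENE_C": [5, 4, 3, 2, 1],
    "GENE_D": [3, 1, 5, 2, 4],
}
edges, degree = coexpression_network(expression)
print(degree)
```

Genes without any edge above the threshold (GENE_D in this toy input) do not appear in the degree table, mirroring how only a subset of the 224 genes ends up in the network.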