Differential methylation analysis in neuropathologically confirmed dementia with Lewy bodies

Dementia with Lewy bodies (DLB) is a common form of dementia in the elderly population. We performed genome-wide DNA methylation mapping of cerebellar tissue from pathologically confirmed DLB cases and controls to study the epigenetic profile of this understudied disease. After quality control filtering, 728,197 CpG-sites in 278 cases and 172 controls were available for the analysis. We undertook an epigenome-wide association study, which found a differential methylation signature in DLB cases. Our analysis identified seven differentially methylated probes and three regions associated with DLB. The most significant CpGs were located in ARSB (cg16086807), LINC00173 (cg18800161), and MGRN1 (cg16250093). Functional enrichment evaluations found widespread epigenetic dysregulation in genes associated with neuron-to-neuron synapse, postsynaptic specialization, postsynaptic density, and CTCF-mediated synaptic plasticity. In conclusion, our study highlights the potential importance of epigenetic alterations in the pathogenesis of DLB and provides insights into the modified genes, regions and pathways that may guide therapeutic developments.


Dementia with Lewy bodies (DLB) is a heterogeneous neurodegenerative disease characterized by parkinsonism, visual hallucinations, fluctuating mental status, and REM sleep behavior disorder 1. An estimated 1.4 million people are living with the disease in the United States 2, and current therapy is limited to symptomatic and supportive care. Although genetic research studies have identified heritable factors that are important in the etiology of this understudied disease 3,4, the molecular causes remain poorly understood, and little is known about non-genetic contributors to its pathogenesis.
Epigenetic changes are modifications to the DNA that regulate gene expression. These modifications are influenced by aging, the environment, lifestyle, disease state, and other factors. They allow cells to respond dynamically to the outside world and are considered the interface between genetic and environmental components. One type of epigenetic alteration is DNA methylation. This tissue-specific mechanism occurs when a methyl group is transferred onto a cytosine, mostly in the context of a cytosine-phosphate-guanine dinucleotide sequence (CpG) in higher eukaryotes. These modifications change the accessibility of the DNA to the transcriptional machinery and fine-tune gene expression. Importantly, epigenetic modifications are thought to play a prominent role in age-associated neurological diseases, such as Alzheimer's disease and Parkinson's disease 5–7. Evidence is also emerging that DNA methylation changes influence the risk of developing DLB 4,8–10.
In this study, we investigated the role of DNA methylation in the pathogenesis of DLB. We performed an epigenome-wide association study (EWAS) to characterize the differential methylation patterns in cerebellar tissue obtained from 298 pathologically confirmed DLB cases and 203 neurologically healthy controls. We further performed gene-region and pathway analyses, demonstrating widespread epigenetic changes associated with this fatal neurodegenerative disease.

Results
Epigenome-wide association study design. We performed an EWAS using cerebellar brain tissue obtained from patients diagnosed with DLB and healthy individuals. We profiled their DNA methylation status with Illumina MethylationEPIC arrays. After quality control filters were applied, 728,197 sites were tested for association with DLB in 278 cases and 172 controls. We then fitted a regression model adjusted for age, sex, experimental batch, five principal components from genotyping data, cell type proportion, and 44 surrogate variables from methylation data.
Among the DMP-associated genes, three have been previously implicated in neurodegenerative or neurodevelopmental disorders: ARSB, encoding Arylsulfatase B, has been associated with Alzheimer's disease and Parkinson's disease 11,12; MGRN1, encoding Mahogunin Ring Finger 1, has been implicated in late-onset spongiform neurodegeneration 13; and IQSEC1, encoding IQ Motif and Sec7 Domain ArfGEF 1, has been associated with a neurodevelopmental disorder 14.
Our data showed modest inflation (lambda = 1.19). To further explore inflation, we also corrected the p values using the Bioconductor package bacon (lambda = 1.02). This approach replicated the results of the main study: all seven DMPs reached the FDR threshold, and the top two probes surpassed the genome-wide significance threshold based on Bonferroni correction (Supplementary Fig. 1).
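For readers unfamiliar with the inflation statistic, the following is a minimal sketch of the conventional median-based genomic inflation factor. It is not the authors' code and does not reproduce the Bayesian correction performed by bacon; the function name and the toy p values are illustrative assumptions.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(p_values):
    """Conventional genomic inflation factor (lambda) from two-sided p values."""
    norm = NormalDist()
    # A two-sided p value maps to a 1-df chi-square statistic via z squared.
    chi2 = [norm.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Toy data (assumption): evenly spaced p values mimic a calibrated null.
null_p = [(i + 0.5) / 1000 for i in range(1000)]
lam_null = inflation_factor(null_p)

# Squaring the p values creates an excess of small p values (inflation).
inflated_p = [p ** 2 for p in null_p]
lam_inflated = inflation_factor(inflated_p)
```

A value close to 1 indicates that the observed test statistics follow the expected null distribution, whereas values above 1 indicate an excess of small p values.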

Identification of differentially methylated genomic regions.
Recent research has shown that differentially methylated regions (DMRs) are more strongly associated with diseases than differential methylation at individual CpG sites alone 19. For this reason, we examined DMRs in our case-control dataset. Our analysis identified 32 CpG sites that clustered in three different DMRs characterized by a hypermethylation signature in DLB cases compared to controls (Table 2). For example, we identified a DMR overlapping the promoter region of DHRS4 and its antisense lncRNA DHRS4-AS1 (adjusted p value = 1.59E-10, mean Δβ = 0.022). DHRS4 has recently been suggested as a novel risk gene for inducing neurodegeneration in mouse models of amyotrophic lateral sclerosis 20.
Functional enrichment analysis determined pathways associated with DLB. We performed a pathway enrichment analysis of differentially methylated genes in our case-control cohort. Specifically, we investigated the 43 genes that overlapped FDR-significant differentially methylated probes. This analysis detected eight biological processes, cellular components, and molecular functions that were significantly associated with DLB. Among the Gene Ontology (GO) terms, we identified an association with the terms "regulation of response to stimulus" (Bonferroni-adjusted p value = 0.0237), "neuron to neuron synapse" (adjusted p value = 0.0494), "vesicle" (adjusted p value = 0.0489), "postsynaptic specialization" (adjusted p value = 0.0385), and "postsynaptic density" (adjusted p value = 0.0276) (Fig. 4). Of note, 29 out of 44 genes showed an interaction with the CTCF protein, a transcription factor that has been associated with Alzheimer's disease and synaptic organization (adjusted p value = 0.0011) (Table 3) 21,22.

Discussion
Our analyses illustrate the value of EWAS in unraveling the complex architecture of neurodegenerative conditions and highlight the potential contributions of methylation to the pathogenesis of DLB. We identified several probes that were differentially methylated in the DLB cases (Fig. 1). Interestingly, many of the associated genes are highly expressed in the brain (Fig. S2) and have been previously implicated in neurological disorders or central nervous system development.
Chief among the loci identified by our study was ARSB, encoding Arylsulfatase B, a member of the sulfatase family that removes sulfate groups from chondroitin-4-sulfate, triggering its degradation. This lysosomal enzyme is involved in cell adhesion, migration, and invasion in colonic epithelium 23. In cultured astrocytes, ARSB silencing increases chondroitin-4-sulfate and neurocan levels and inhibits astrocyte-mediated neurite outgrowth, suggesting that ARSB may play an important role in neuronal plasticity in the central nervous system 24. Homozygous or compound heterozygous mutations of the gene lead to Mucopolysaccharidosis type VI, a lysosomal storage disorder characterized by skeletal anomalies, short stature, and cardiac abnormalities 25. Our data complement previous studies implicating ARSB variants in Alzheimer's disease and Parkinson's disease 11,12, expanding the pathogenic role of the gene in neurodegeneration.
DLB affects males more than females 26, but little is known about sex-specific contributions to the pathogenesis. Our sex-specific EWAS showed a male-driven effect, suggesting that epigenetic modifications could modulate the risk in males and females separately. However, our analysis did not include sex chromosomes, and future studies exploring epigenetic changes in these chromosomes are needed to draw conclusions.
Interestingly, our study also showed an enrichment of genes that regulate the neuron-to-neuron synapse, postsynaptic specialization, and postsynaptic density in DLB. Moreover, we highlighted the transcriptional repressor CTCF as a key factor able to orchestrate these biological processes. This protein is actively involved in maintaining the three-dimensional structure of the chromatin, creating topologically associated functional domains within the nucleus 27. Our data suggest that the chromatin architecture could play a role in the pathogenesis of DLB, and support recent evidence implicating CTCF-mediated synaptic plasticity in Alzheimer's disease 22. These observations expand the role of CTCF and synaptic organization-related genes in neurodegeneration.
Our data corroborate emerging evidence implicating aberrant epigenetic modulation in neurodegeneration 28,29. Only a few studies have investigated the epigenetic changes associated with DLB, and they differ in sample size, targeted tissue, and study design 8,10,30. For example, Shao and colleagues performed an EWAS of Brodmann area 7 in a cohort consisting of fifteen pathologically confirmed DLB cases and sixteen neurologically healthy controls. Despite the limited sample size and the different brain region investigated, their study design is the closest to that of our EWAS. In contrast, Nasamran and collaborators profiled blood epigenetic modulations comparing 42 DLB patients and 50 Parkinson's disease dementia cases, while Pihlstrom and colleagues explored the epigenetic modulations associated with different Braak Lewy body disease stages in 322 Parkinson's disease and DLB cases. Our study identified none of the DMPs or DMRs previously associated with DLB (Supplementary Data 3). However, the outlined differences in the design of these studies may account for these discrepancies.
Significant changes in methylation have also been observed in Alzheimer's disease and Parkinson's disease 31–33. Sharma and collaborators showed that epigenetic modifications across single nucleotide polymorphisms, located within the first intron of the SNCA gene, modulate the susceptibility to Parkinson's disease. A meta-analysis of 1,453 individuals with Alzheimer's disease, investigating the epigenetic changes associated with Braak neurofibrillary tangle stage, identified differentially methylated sites and regions in the prefrontal cortex, temporal gyrus, and entorhinal cortex, but not in the cerebellum. Interestingly, our EWAS replicated some of the findings from Smith's study, identifying epigenetic modification involving a common CpG site within FRMD4A (cg03775372) and a common DMP-associated gene, MKL2. This finding raises the scientific interest surrounding cg03775372 and the FRMD4A gene, since the same CpG reached FDR significance in our main study and achieved genome-wide significance in the sex-specific EWAS.
We selected cerebellar tissue for our research as it is relatively spared in the terminal stages of DLB, unlike cortical tissue in which most cells of interest have been lost to neurodegeneration. Selecting a relatively spared tissue source provides a more accurate window into the epigenetic plasticity of a disease. This detail needs to be weighed against the fact that the disease-relevant changes are likely more prominent and representative in the regions primarily affected by the disease. Sampling multiple regions for comparison would have been ideal. Future efforts will likely increase our ability to identify disease-associated methylation patterns across the brain, and we have made the EWAS results from our study publicly available to facilitate this unfolding research.
Our EWAS has several limitations. The Illumina MethylationEPIC array contains only a fraction of the human genome's CpGs. Furthermore, the bisulfite conversion chemistry does not distinguish between 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC), a brain-specific intermediate product of 5mC demethylation 34. We also did not adjust the data for environmental factors that may impact the methylation status, such as vascular comorbidities, smoking, and alcohol use.
The minfi approach that we used to estimate the cell type proportion was able to discriminate between neuronal and non-neuronal cells, using the frontal cortex region as reference. This method is based on the neuron-specific protein NeuN, which is expressed in the vast majority of neurons, though Purkinje cells represent an exception. Our approach has already been successfully applied in EWAS based on DNA from cerebellar tissue 35,36. Moreover, exploring the surrogate variables, we identified a moderate negative correlation between NeuN-negative cell proportion and surrogate variables 1 and 3 (Pearson correlation = −0.64 and −0.61, respectively), and a positive correlation between NeuN-positive cell proportion and surrogate variable 3 (Pearson correlation = 0.83). These data suggest that the surrogate variable analysis accounts for at least some of the variability due to the different cell types within a tissue. As such, it represents a valuable approach that could be employed in similar instances where there is no tool to estimate the cell proportion in a tissue (Supplementary Fig. 3).
Genome-wide association studies are affected by inflation and EWAS are not an exception, so exploring inflation is crucial to reduce the number of unreliable results. Our data showed moderate inflation (lambda = 1.19) (Fig. 1b). We further reduced genomic inflation using the R package bacon (lambda = 1.02, Supplementary Fig. 1), and we were able to replicate the results of the main EWAS. To consolidate our main results, we also performed the EWAS using the MLM-based omic association (MOA) tool from OSCA 37. This tool represents a stringent approach to processing DNA methylation data. Not surprisingly, therefore, inflation was drastically reduced when this tool was applied (lambda = 1.005), but none of the probes surpassed the genome-wide significance threshold (Supplementary Fig. 4). However, the top 48 CpG sites identified in our study showed a consistent pattern when comparing the two approaches (Pearson correlation = 0.87789), and the directions of their modulations were coherent (Supplementary Fig. 5).
Sample size differences between the male- and female-specific cohorts may represent a bias of the study, particularly for the sex-specific EWAS. Furthermore, the absence of matching for age and sex between the patients and controls might have led to an overestimation of the contribution of the identified loci to the pathogenesis of DLB. However, this effect was probably mitigated by including age and sex as covariates in the association model 38. Because we used post-mortem tissue, we cannot discriminate between causal effects and downstream consequences of the DNA methylation changes observed. Finally, we detected only modest effects on DNA methylation, though this is consistent with previous EWAS efforts investigating neurodegenerative disorders 8,10,30.
In summary, we investigated the differential methylation signature of a large cohort of patients with pathologically confirmed DLB. We delineated clear epigenetic modulation associated with this common form of neurodegeneration, defined differentially methylated probes and regions, and highlighted new loci and biological pathways affected by these changes. In particular, we provide evidence implicating the ARSB gene and CTCF-mediated synaptic plasticity in DLB. Our study underlines the potential role that epigenetic modulation plays in the pathogenesis of DLB and represents an opportunity for the future identification of biomarkers and new therapeutic targets.

Methods
Study samples. Frozen cerebellar tissue from 298 DLB patients and 203 neurologically healthy controls was obtained from brain donation programs. The demographic and clinical characteristics of the study participants are summarized in Supplementary Table 1. The DLB patients were diagnosed with pathologically definite disease (limbic or neocortical subtype) according to the McKeith consensus criteria 1. Neurologically healthy controls were selected based on the absence of neurological disease in their clinical history and the absence of neurodegenerative disease on pathological examination. All participants were of European ancestry. Informed consent for post-mortem brain tissue donation was obtained from all subjects or their surrogate decision makers according to the Declaration of Helsinki. Each brain donation program was approved by its own institutional ethics committee. The convenience control samples were obtained from the same brain banks as the DLB cases. The controls were not specifically matched for age or sex; however, age and sex distributions among cases and controls were comparable (Supplementary Fig. 6, Supplementary Table 1).
Quality checks and filtering were performed to exclude low-quality samples and probes (summarized in Fig. S7). Samples meeting any of the following criteria were excluded: (1) low overall quality based on sample-dependent and sample-independent control probes and methylated/unmethylated probes ratio (using default settings in MethylAid), (2) bisulfite conversion rate < 80.0%, (3) mismatch between reported sex and genotypic sex, (4) mean detection p value > 0.01, (5) samples that were flagged as outliers using the default setting in wateRmelon, and (6) samples that had > 1.0% of probes with a detection p value > 0.05.
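The threshold-based sample exclusions listed above (criteria 2, 3, 4, and 6) can be sketched as a simple filter. This is an illustrative assumption of how such rules might be encoded, not the authors' pipeline; the record fields and toy samples are hypothetical, and the MethylAid and wateRmelon checks (criteria 1 and 5) are not reproduced.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    # Hypothetical per-sample QC record; field names are illustrative.
    sample_id: str
    bisulfite_conversion_pct: float
    mean_detection_p: float
    frac_probes_failed: float  # fraction of probes with detection p > 0.05
    reported_sex: str
    genotypic_sex: str


def exclusion_reasons(s):
    """Return the threshold-based criteria that a sample fails."""
    reasons = []
    if s.bisulfite_conversion_pct < 80.0:
        reasons.append("bisulfite conversion rate < 80.0%")
    if s.reported_sex != s.genotypic_sex:
        reasons.append("reported/genotypic sex mismatch")
    if s.mean_detection_p > 0.01:
        reasons.append("mean detection p value > 0.01")
    if s.frac_probes_failed > 0.01:
        reasons.append("> 1.0% of probes with detection p value > 0.05")
    return reasons


# Toy samples (assumption), not data from the study.
samples = [
    SampleQC("S1", 95.2, 0.002, 0.004, "F", "F"),
    SampleQC("S2", 76.0, 0.003, 0.002, "M", "M"),
    SampleQC("S3", 92.1, 0.020, 0.030, "M", "F"),
]
excluded = {s.sample_id: exclusion_reasons(s) for s in samples if exclusion_reasons(s)}
```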
We then performed a principal component analysis of the Infinium MethylationEPIC array control probes. Data normalization was done using the R package minfi (preprocessFunnorm function), with 24 principal components (explaining 99.0% of variance) 43. Following normalization, we excluded: (1) probes with a detection p value > 0.01, (2) probes targeting CpG sites containing SNPs of any minor allele frequency, (3) probes located on sex chromosomes, and (4) probes reported as cross-reactive on the Illumina 450K and EPIC methylation arrays 44.
We generated beta- and M-values using the minfi functions getBeta and getM, respectively. Significant surrogate variables were generated from M-values using the sva package (v.3.46.0) in R 45. We estimated the proportion of neuronal and non-neuronal cell types using the minfi package (estimateCellCounts function) 35,36.
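As background for the two methylation scales mentioned above, the sketch below shows the standard relationship between beta values and M-values. It is an illustration rather than the minfi implementation; the function names, the optional offset parameter, and the example intensities are assumptions.

```python
import math


def beta_value(meth, unmeth, offset=0.0):
    """Proportion of methylated signal; the offset stabilizes low intensities."""
    return meth / (meth + unmeth + offset)


def m_value(beta):
    """M-value as the base-2 logit of the beta value."""
    return math.log2(beta / (1.0 - beta))


def beta_from_m(m):
    """Inverse transformation from M-value back to beta value."""
    return 2.0 ** m / (2.0 ** m + 1.0)


# Hypothetical methylated and unmethylated probe intensities.
beta = beta_value(3000.0, 1000.0)
m = m_value(beta)
```

Beta values are bounded between 0 and 1 and read directly as a methylation proportion, whereas M-values are unbounded and behave better statistically near the extremes.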
EWAS analysis in DLB. The DNA methylation status of each CpG site (as measured by β-values, Fig. S8) was tested for association with DLB using the Bioconductor package limma (v.3.46.0) 42 in R (v.4.0.5). In contrast to GWAS, there are no common guidelines established in the field for conducting EWAS analyses. We performed our study using beta values since they directly represent the proportion of methylated CpG sites, making it easier to relate the values to biological processes. Supporting this approach, beta values and M-values showed highly correlated outputs in our analysis (Pearson correlation of p values = 0.904), and the top 48 probes showed a consistent direction of effect between the two approaches (Fig. S9). Age, sex, experimental batch, the first five principal components (generated from the Infinium Global Diversity Array + Neuro Booster genotyping data to account for population stratification), NeuN-positive/NeuN-negative cell type proportion (minfi), and all significant surrogate variables (n = 44) (Bioconductor package sva, v.4.3) were included as covariates in the linear regression model. A two-sided p value and a Δβ-value were calculated for each CpG site. We used the Illumina EPIC annotation R package (IlluminaHumanMethylationEPICanno.ilm10b4.hg19) to define the overlapping genes. Four additional genes were manually annotated since the probes overlapped a gene (IQSEC1: cg06951630, LINC01158: cg07171538, and LINC00856: cg05757757) or mapped within 10 base pairs upstream of a gene (EPHA8: cg25394625). P values were adjusted by Bonferroni correction. The Bonferroni threshold for declaring a differentially methylated probe (DMP) to be genome-wide significant was 6.87 × 10⁻⁸ (= 0.05/728,197 markers). We identified sub-significant probes surpassing the False Discovery Rate threshold of 0.05, and we included them in the functional enrichment analysis. To further reduce genomic inflation by removing unknown bias, we corrected the p values using the Bioconductor package bacon (v.1.26.0).
EWAS evaluations were also performed in male (n = 165 cases and 122 controls) and female participants (n = 113 cases and 50 controls) through an interaction model, to assess possible sex-specific epigenetic modulation. Sex was not included as a covariate in those analyses.
The Bioconductor package DMRcate (v.2.4.1) was used with the recommended default settings (lambda = 1000 and C = 2, corresponding to 1 standard deviation of the Gaussian kernel every 500 base pairs) to identify and evaluate regions in the DLB data for evidence of differential methylation 43. This software calculated p values based on the smoothed FDR of CpGs within the region.
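To make the kernel settings concrete, the sketch below shows how a bandwidth of lambda = 1000 and a scaling factor of C = 2 translate into a Gaussian standard deviation of 500 base pairs, and how quickly the weight given to neighboring CpGs decays with distance. It assumes the relation sigma = lambda/C described in the DMRcate documentation; the function names and unnormalized weights are illustrative, and the full DMRcate smoothing procedure involves further steps not shown here.

```python
import math


def kernel_sigma(lam=1000.0, c=2.0):
    """Gaussian kernel standard deviation in base pairs (sigma = lambda / C)."""
    return lam / c


def gaussian_weight(distance_bp, sigma):
    """Unnormalized Gaussian weight for a CpG at the given distance."""
    return math.exp(-(distance_bp ** 2) / (2.0 * sigma ** 2))


sigma = kernel_sigma()
# Relative weight of neighboring CpGs at increasing genomic distances.
weights = {d: gaussian_weight(d, sigma) for d in (0, 250, 500, 1000)}
```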
Statistics and reproducibility. We performed a case-control association study by fitting a linear regression model for each marker using the Bioconductor package limma (v.3.46.0). The topTable function was used to calculate the statistics of differentially methylated probes comparing DLB cases to healthy control subjects, adjusting the p values for multiple testing. The Bonferroni-corrected genome-wide significance threshold was set to p < 6.87 × 10⁻⁸ (= 0.05/728,197 sites tested). We applied a False Discovery Rate p value correction to declare sub-significant markers. To facilitate reproducible results, we made the analysis code publicly available on GitHub (https://github.com/pireho/EWAS-Lewy_body_dementia) and Zenodo (https://zenodo.org, DOI: 10.5281/zenodo.10365334).
Reporting summary. Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
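The two multiple-testing corrections described above can be illustrated with a short sketch. The Bonferroni threshold reproduces the arithmetic stated in the text; the False Discovery Rate adjustment is assumed here to follow the Benjamini-Hochberg procedure, and the function names and toy p values are illustrative rather than taken from the analysis code.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Threshold for the 728,197 sites tested in the study.
threshold = bonferroni_threshold(0.05, 728_197)

# Toy p values (assumption), not results from the study.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.6]
toy_q = benjamini_hochberg(toy_p)
```

Bonferroni correction controls the chance of any false positive and is therefore stricter, which is why probes can pass the FDR threshold without reaching genome-wide significance.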

Fig. 1 Volcano plot, QQ-plot, and Manhattan plot of EWAS results. Volcano plot (a) showing statistical significance (−log10 p value) and magnitude of change (delta beta, Δβ) of all CpG sites included in the DLB EWAS analysis. Red and blue dots indicate significantly hypomethylated DMPs and hypermethylated sites in DLB cases compared to controls, respectively. The Bonferroni adjusted p value < 0.05 threshold is shown as a red dashed line. QQ-plot (b) showing the p value distribution and inflation (lambda value). Density plot (c) illustrating the observed p value distribution. Manhattan plot (d) showing the p values of the tested probes across the genome. The genome-wide significance threshold (Bonferroni adjusted p value < 0.05) is shown as a red dashed line, while the orange dashed line represents the FDR threshold. Probes surpassing the genome-wide significance are shown as red dots.

Fig. 2 Differentially methylated probes in DLB. The violin plots show the DNA methylation (beta value) distribution (violin shape) in the seven differentially methylated probes (panels a-g). The vertical axis represents the range of values in the dataset, where 0 and 1 mean fully unmethylated and fully methylated, respectively. The box plot represents the interquartile range of the dataset (25% bottom, 75% top), the middle line represents the median of the distribution, and the central line shows the value distribution. Black dots represent outliers. The overall difference between DLB cases and controls is shown as delta beta (Δβ); negative and positive values refer to hypomethylation and hypermethylation in DLB cases, respectively (e.g., −0.020 indicates that DLB cases show a 2% decrease in DNA methylation compared to controls). P values refer to Bonferroni corrected p values.

Fig. 3 Volcano plot, QQ-plot, and Manhattan plot of sex-specific EWAS. Volcano plot (a) showing statistical significance (−log10 p value) and magnitude of change (Δβ) of all sites included in the analysis. Negative values (dots on the left side of the volcano plot) indicate hypomethylated DMPs, and hypermethylated sites are displayed as positive values (right-sided dots). The Bonferroni adjusted p value < 0.05 threshold is shown as a red dashed line. QQ-plot (b) showing the p value distribution and inflation (lambda value). Density plot (c) illustrating the observed p value distribution. Manhattan plot (d) showing the p values of the probes across the genome. The Bonferroni adjusted p value < 0.05 threshold is shown as a red dashed line. Probes surpassing the genome-wide significance are shown as red dots.

Fig. 4 Pathway enrichment analysis in pathologically confirmed DLB. Functional enrichment of significant gene ontology pathways for biological processes (GO:BP, orange dot), cellular components (GO:CC, blue dots), and TRANSFAC database (TF, red dots) in pathologically confirmed DLB versus controls. The x-axis shows the p value associated with each pathway on a −log10 scale, and the size of each dot indicates the number of genes involved.

Table 1
Significant differentially methylated probes in the DLB EWAS.
Chr. chromosome, Adj. P adjusted p value. Chromosome positions are shown relative to the human reference genome (hg19). Gene names are shown according to UCSC RefGene. The Δβ values refer to the difference between DNA methylation (β-values) in cases compared to controls (e.g., −0.020 indicates that DLB cases show a 2% decrease in DNA methylation compared to controls). Adjusted p values refer to Bonferroni corrected p values.

Table 2
Differentially methylated regions in the DLB EWAS.
Chr. chromosome, No. CpGs number of CpGs per region, Mean Δβ mean probe delta beta. Chromosome positions are shown relative to the human reference genome (hg19). P value refers to smoothed FDR corrected p value.

Table 3
Biological pathways associated with differential methylation in DLB.
No. Genes number of genes, GO Gene Ontology, BP Biological Process, CC Cellular Component, TF TRANSFAC. P value refers to Bonferroni corrected p value.