Introduction

Proximal spinal muscular atrophy (SMA) is an autosomal recessive neuromuscular disorder caused by degeneration of motor neurons in the anterior horns of the spinal cord. SMA is subdivided into four types depending on the age of onset and the severity of symptoms.1, 2 Mutations in the survival motor neuron gene 1 (SMN1) are responsible for all forms of SMA. The SMN1 gene encodes the ubiquitously expressed SMN protein, which is involved in snRNP biogenesis. SMN appears to fulfill special functions in motor neurons or to participate in mechanisms that are critical for their survival.3, 4 In particular, SMN has been identified in β-actin mRNA transport granules and might be involved in actin cytoskeleton dynamics via the Rho GTPase–ROCK pathway.5, 6 However, the precise role of SMN in the degeneration of motor neurons is still unknown. The SMN2 gene is a nearly identical centromeric copy of SMN1 and is suggested to be the main modifier of disease manifestation. A strong correlation between the SMN2 gene copy number and the clinical severity of SMA has been confirmed in different populations.7, 8 However, rare cases have been reported in which siblings with identical SMN1 mutations and identical SMN2 copy numbers show marked clinical discrepancies. These cases, together with observed differences in SMN2 gene copy number between asymptomatic individuals, suggest the existence of additional genetic or epigenetic factors modifying the phenotype.9, 10 Investigation of these modifiers is important for understanding the molecular mechanisms involved in SMA pathogenesis, for the development of efficient therapies and for the identification of new molecular biomarkers. Previous studies have revealed an influence of plastin 3 and profilin IIa protein levels, as well as of the c.859G>C substitution in the SMN2 gene, on the severity of SMA.11, 12, 13 However, these findings apply to only a limited number of SMA cases.

DNA methylation is an important epigenetic modification that alters gene expression patterns. Changes in the methylation pattern are associated with various disease processes.14 Several studies have reported differences in DNA methylation levels in leukocytes between cancer patients and healthy individuals, supporting the possibility of using this characteristic as a disease marker.15, 16 Differential methylation of CpG sites near the transcriptional start site of SMN2 was demonstrated between type I and type III SMA patients in DNA samples isolated from leukocyte and fibroblast cell lines.17 However, to date, no studies have assessed the whole-genome methylation status of SMA patients compared with healthy individuals. Such an analysis could be relevant not only for identifying new possible SMA-modifying factors but also for predicting the changes in genomic DNA methylation induced by potential SMA therapy agents, as most of them possess DNA-demethylase activity.17, 18

In this study, we carried out a whole-genome methylation pattern analysis of DNA samples from peripheral whole blood leukocytes of 12 male SMA patients with severe and mild forms of the disease and 11 healthy male individuals of the same age.

Materials and methods

Ethics statement

All healthy individuals, adult patients and parents of all children gave written informed consent to the diagnostic procedures. The analysis was approved by the ethics committee at Ott’s Institute of Obstetrics and Gynecology RAMS.

Subjects

In all, 12 patients with SMA and 11 healthy individuals from the North-Western region of Russia were selected for this study (Table 1). Only male subjects were included in this study in order to eliminate gender effects. All SMA patients were tested for the deletion in the SMN1 gene and SMN2 gene copy number as described previously.10

Table 1 Information about SMA patients and control individuals

DNA methylation profiling

The Infinium HumanMethylation450 BeadChip (Illumina, San Diego, CA, USA), which allows assessment of the methylation status of 485 577 cytosines distributed over the whole genome,19 was used to determine the methylation profile of genomic DNA. All genomic DNA was isolated from peripheral blood leukocytes by phenol–chloroform extraction.20 Bisulfite conversion of genomic DNA and sample preparation for the platform analysis were performed as described earlier.21 The BeadChip was read on an Illumina iScan scanner and processed using the GenomeStudio 2009.2 software (Illumina).

Data processing and statistical analysis

All downstream data processing and statistical analyses were performed with the statistical software R (www.r-project.org) together with the lumi, limma and IMA packages of the Bioconductor project.22

Two measurements, the β-value and the M-value, have been proposed to measure the methylation level.19 The β-value indicates the percentage of methylated alleles at a site and is biologically more intuitive than the M-value, which is the log2 ratio of the methylated to the unmethylated probe intensities.23 However, the M-value has better statistical properties and was therefore used for the differential methylation analysis, although results are reported as β-values.
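
The relationship between the two measurements can be illustrated with a minimal Python sketch. It assumes the commonly used conversion M = log2(β / (1 − β)), which holds when the small offset added to the probe intensities in the β-value definition is ignored. The function names are illustrative and are not part of the lumi, limma or IMA packages.

```python
import math


def beta_to_m(beta: float) -> float:
    """Convert a beta-value (0 < beta < 1) to an M-value.

    Assumes M = log2(beta / (1 - beta)), i.e. the intensity offset
    in the beta-value definition is neglected.
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))


def m_to_beta(m: float) -> float:
    """Convert an M-value back to a beta-value (inverse of beta_to_m)."""
    return 2.0 ** m / (2.0 ** m + 1.0)
```

Under this assumption, a half-methylated site (β = 0.5) corresponds to M = 0, and the M-value scale stretches the extremes of the β-value range, which is one reason it behaves better in linear models.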

Data preprocessing

Quality control

The data were imported and submitted to quality control using the IMA.methy450PP function of the IMA package, resulting in the removal of CpG sites with missing β-values, detection P-value <0.05 and sites where <75% of the samples had a detection P-value <10⁻⁵. In addition, samples with missing β-values, with average detection P-values >10⁻⁵ and samples having <75% of sites with detection P-value <10⁻⁵ were also removed at the sample level. Overall, 469 977 CpG sites and all samples passed the quality control successfully.
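
One of the site-level criteria above, the requirement that at least 75% of the samples have a detection P-value below the 10⁻⁵ cutoff, can be sketched as follows. This is only an illustration of the rule as stated in the text, not the IMA.methy450PP implementation; the function name and its parameters are hypothetical.

```python
def site_passes_detection_filter(detection_p, p_cutoff=1e-5, min_fraction=0.75):
    """Return True if enough samples detect the site reliably.

    detection_p  : detection P-values of one CpG site, one per sample
    p_cutoff     : a sample counts as reliable if its P-value is below this
    min_fraction : minimum fraction of reliable samples required to keep the site
    """
    if not detection_p:
        raise ValueError("at least one sample is required")
    reliable = sum(1 for p in detection_p if p < p_cutoff)
    return reliable / len(detection_p) >= min_fraction
```

The same rule, applied across sites instead of across samples, gives the sample-level filter described in the paragraph.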

All downstream analyses were performed separately for two groups: the first group included the six SMA patients with type I or II disease and the five corresponding control individuals, whereas the second group contained the six SMA patients with type III or IV disease and the six corresponding control individuals (Table 1).

Normalization

Quantile normalization was performed on the M-values of all the 469 977 CpG sites using the lumiMethyN function of the lumi package.
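
The idea behind quantile normalization can be shown with a generic sketch: every sample is forced onto the same reference distribution, namely the mean of the sorted values across samples. This is a textbook version written for illustration. It is not the lumiMethyN code, and it breaks ties simply by position, which is an assumption of this sketch.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of samples (each a list of M-values).

    All samples must contain the same number of sites. Each value is
    replaced by the mean, across samples, of the values sharing its rank.
    """
    n_sites = len(columns[0])
    if any(len(col) != n_sites for col in columns):
        raise ValueError("all samples must have the same number of sites")

    # Reference distribution: mean of the i-th smallest value over samples.
    sorted_cols = [sorted(col) for col in columns]
    reference = [
        sum(col[i] for col in sorted_cols) / len(columns) for i in range(n_sites)
    ]

    normalized = []
    for col in columns:
        # Indices of the sites ordered from lowest to highest value.
        order = sorted(range(n_sites), key=lambda i: col[i])
        out = [0.0] * n_sites
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized
```

After normalization, all samples share exactly the same set of values and differ only in which site carries which value.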

Data processing

Modeling of the methylation level of each CpG site

The relationships between methylation level and variables of interest were determined using limma’s robust regression method (lmFit command with setting method=‘robust’, maxit=10 000) to fit the following linear model (1) for each CpG site k:

Mk = bk0 + bkH·H + bkA·A + ɛk    (1)

where Mk is the log2 transformed methylation level of CpG site k, H is the dichotomized health state (control=0 and patient=1), A is the age and ɛk is the unexplained variability. The coefficients bkx summarize the relationship between the methylation level and the variables of interest. Moderated t-statistics for each variable and CpG site were computed using the empirical Bayes model implemented in limma (eBayes command), in order to efficiently correct batch effects.24 To control for false positives, P-values were adjusted for multiple comparisons as proposed by Benjamini and Hochberg (BH).25 Adjusted P-values >0.05 were considered nonsignificant.
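
The BH adjustment itself is simple enough to sketch in a few lines. The following Python function is an illustrative stand-alone version of the standard step-up procedure, not the code used in the analysis, and its name is hypothetical.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order.

    Each P-value is multiplied by m / rank (m = number of tests, rank = its
    position in ascending order); a running minimum taken from the largest
    P-value downwards keeps the adjusted values monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A site would then be called significant when its adjusted P-value does not exceed 0.05, in line with the threshold stated above.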

Modeling of the methylation level of each region

A similar linear model (2) was implemented for the region-based differential methylation analysis of each of the 11 regions included in the 450k BeadChip annotation:

Mk = bk0 + bkH·H + bkA·A + ɛk    (2)

where Mk is the average log2 transformed methylation level of gene/CpG island k. A more detailed description can be found in Supplementary 1.
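
As an illustration of the region-level response variable, the sketch below averages the M-values of all sites annotated to the same gene or CpG island. The simple unweighted mean, the function name and the identifiers used in the example are assumptions; the exact aggregation used in the analysis is described in Supplementary 1.

```python
from collections import defaultdict


def region_mean_m_values(site_m, site_region):
    """Average the M-values of the sites belonging to each region.

    site_m      : dict mapping a site identifier to its M-value
    site_region : dict mapping a site identifier to a region (gene/CpG island)
    Sites without a region annotation are skipped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for site, m_value in site_m.items():
        region = site_region.get(site)
        if region is None:
            continue
        sums[region] += m_value
        counts[region] += 1
    return {region: sums[region] / counts[region] for region in sums}


# Hypothetical identifiers, for illustration only.
example = region_mean_m_values(
    {"site_1": 1.0, "site_2": 3.0, "site_3": -2.0, "site_4": 0.5},
    {"site_1": "GENE_A", "site_2": "GENE_A", "site_3": "GENE_B"},
)
```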

Gene ontology (GO) enrichment analysis

In each group, the gene or genes associated with each gene-based site were identified. The web-based DAVID functional annotation tool (www.david.abcc.ncifcrf.gov) was used to determine the functional groups to which the genes belonged and to extract the functional groups most represented in each of the gene lists. All genes from the 450k BeadChip annotation file were used as the background list. The classification stringency was set to ‘medium’ (default).

Results

Which genes are differentially methylated between SMA patients and controls?

We aimed to identify genes with distinct methylation patterns between SMA patients and healthy individuals. To this end, the link between the methylation profile and SMA was first identified by fitting a linear model that explains the methylation level for each CpG site and for each of the 11 annotated regions. For each of the linear models, we conducted two separate differential methylation analyses: one in the first analyzed group and another in the second analyzed group. We then compared the two analyses and looked for similarities between them.

Analysis at the CpG sites level

Overall, 1512 and 873 CpG sites were found to be significantly differentially methylated between patients and controls in the first and second groups, respectively. The significant CpG sites were separated into two sets: those significantly more methylated in patients than in controls (positive fold change) and those significantly less methylated in patients than in controls (negative fold change; Figure 1). Only genes associated with sites displaying the same fold change direction in both groups were considered for further analysis. In total, 10 common CpG sites were found to be significantly differentially methylated between patients and controls in both the first and second analyzed groups: two with a positive fold change and the remaining eight with a negative fold change (Figure 2). Seven of these sites were associated with nearby genes (Table 2). In addition, two sites located in the gene body of ARHGAP22 were found to have a significant negative fold change exclusively in the second analyzed group.
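
The selection of common sites with a concordant fold change direction can be sketched as a set intersection. The function below is a hypothetical illustration, assuming each group is summarized as a mapping from significant site identifiers to signed fold changes; the identifiers and values in the example are invented.

```python
def concordant_common_sites(group1, group2):
    """Find sites significant in both groups with the same fold change sign.

    group1, group2 : dicts mapping significant site identifiers to their
                     signed fold change (patients versus controls)
    Returns a dict mapping each retained site to 'hyper' (more methylated
    in patients) or 'hypo' (less methylated in patients).
    """
    common = {}
    for site in group1.keys() & group2.keys():
        if (group1[site] > 0) == (group2[site] > 0):
            common[site] = "hyper" if group1[site] > 0 else "hypo"
    return common


# Invented example values, for illustration only.
first_group = {"site_a": 0.12, "site_b": -0.08, "site_c": -0.05}
second_group = {"site_a": 0.09, "site_b": 0.04, "site_c": -0.11, "site_d": -0.20}
shared = concordant_common_sites(first_group, second_group)
```

In this toy example, site_b is dropped because its direction differs between the groups, and site_d is dropped because it is significant in only one group.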

Figure 1

Venn diagrams of the significant CpG sites/genes that are more methylated in patients than in controls (a) and less methylated in patients than in controls (b). (a) Among 15 CpG sites, there were 2 significant CpG sites common to the two analyzed groups and 13 common genes with at least one significant CpG site, although not the same site in the two analyzed groups. (b) Among 25 CpG sites, there were 8 significant CpG sites common to the two analyzed groups and 17 common genes with at least one significant CpG site, although not the same site in the two analyzed groups.

Figure 2

Bar plot of the β-values of the 10 CpG sites significantly differentially methylated between patients and controls in both analyzed groups. The first and second analyzed groups are pooled. The vertical black lines are the standard errors of the means. The significance level is shown as * if 0.05>P-value>0.01, ** if 0.01>P-value>0.001 and *** if P-value<0.001, in blue for the first analyzed group and in red for the second analyzed group.

Table 2 Information about the 10 common significantly differentially methylated CpG sites and the genes associated with these CpG sites

Apart from these 10 sites, we identified 30 other genes for which there was at least one significant CpG site with the same fold change in both groups. However, the sites concerned were not the same in the two groups and were often located in different regions (Supplementary Table). A description of the methylation level for some of these genes can be found in Supplementary 1.

The P-values calculated for the 10 common sites were among the lowest P-values of all the significant genes/CpG sites found in common between the two groups (Supplementary Figure 1), thus supporting the significance of these sites. In addition, when a more stringent P-value adjustment was applied (Bonferroni correction instead of BH correction), only the CpG sites associated with LIAS/RPL9 and ARHGAP22 remained significant in both groups.
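
The stringency of the Bonferroni correction follows directly from its definition: every P-value is multiplied by the total number of tests and capped at 1, whereas the BH procedure multiplies by the number of tests divided by the rank of the P-value. A minimal sketch, with an illustrative function name, is:

```python
def bonferroni_adjust(pvalues):
    """Return Bonferroni-adjusted P-values: min(1, p * number_of_tests)."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]
```

With several hundred thousand CpG sites tested, a site must therefore reach a raw P-value on the order of 10⁻⁷ to remain significant at the 0.05 level under this correction.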

Analysis at the region level

A description of the results of the region-level analysis can be found in Supplementary 1.

To what GO terms do the genes belong?

Gene enrichment analysis revealed eight GO clusters in the first analyzed group that displayed an enrichment score >5, including three clusters involved in the regulation of transcription (enrichment scores 10.3, 7.7 and 6.6). In the second analyzed group, only two clusters displayed an enrichment score >5, including one cluster involved in cancer pathways (enrichment score 6.5).

Discussion

In this study, we performed the first genome-wide methylation analysis comparing patients with severe and mild forms of SMA with healthy individuals of the same age, in order to identify SMA-modifying methylation changes. We found 10 CpG sites (Table 2) with significantly different methylation levels both between patients with the severe form of SMA and the corresponding controls and between patients with the mild form and the corresponding controls (Table 1).

It is clear that axon and synaptic pathology has an important role in the development of SMA, as a significant reduction in the number of neuromuscular junctions (NMJs), together with their morphological and functional alterations and neurofilament aggregation, has been found in severe SMA mice.26, 27 Moreover, defects in motor neuron axon outgrowth and pathfinding, as well as a decrease in the growth cone area, have been observed in different SMA animal models and cell cultures.28, 29, 30 All these processes are connected with disturbances in the regulation of actin filament polymerization and disassembly performed by actin-binding proteins.31 The decreased methylation level at one CpG site in the CHML gene and the increased methylation level at another CpG site in the ARHGAP22 gene, between patients and controls in both the first and second analyzed groups, are of particular interest in this context. CHML encodes Rab escort protein 2 (REP2), which assists in the geranylgeranylation of most Rab GTPases, whereas the ARHGAP22 gene encodes a Rho GTPase-activating protein (RhoGAP). Rab and Rho GTPases are known to be important regulators of actin dynamics.32 Using the STRING 9.0 database,33 we found that the product of the CHML gene, REP2, interacts with the Rab5A, Rab3A, Rab1A and Rab6A GTPases and that the ARHGAP22 protein interacts with the RhoB, RhoD, RhoF, RhoG, RhoH, RhoU and RhoQ GTPases (string-db.org, medium confidence score 0.400).

The isoprenyl modification of the Rab C-terminus, catalyzed by the REP2 protein, is necessary for Rab GTPases to interact with the membrane and to regulate vesicular trafficking.34, 35 Rab proteins are involved in processes of membrane trafficking, including vesicle formation, vesicle movement along actin and tubulin networks, and membrane fusion.32, 36 The Rab3A GTPase is involved in targeting synaptic vesicles to the active zones and in neurotransmitter release in nerve terminals.37 The number of Rab3A-depleted NMJs was about 25% higher in P0-1 SMA mice than in healthy control mice. A marked decline in Rab3A expression was also observed in P5-6 SMA mice.38 Several Rabs appear to be linked to microtubule- or actin-based motor proteins.36 Indeed, Rab6A interacts with myosin II and the dynein–dynactin complex in microtubule-dependent transport pathways from and to the Golgi complex.39, 40 According to the FunCoup database, Rab6A interacts with profilin II (PFN2) with a pfc-value of 0.552. Profilin II, which has been confirmed to interact directly with SMN, is a regulator of actin dynamics and is required for actin polymerization in the synapse.6, 41

The product of the ARHGAP22 gene, RhoGAP, converts Rho GTPases to an inactive GDP-bound state. Rho GTPases are known as important regulators of the initiation, growth, guidance and branching of axons.32, 42 Cooperation of the Rho GTPase effectors, ROCK and the Formin family of proteins, is thought to form actin filaments and actomyosin bundles.43 On the one hand, inactivation of the Rho/ROCK pathway increases the amount of dephosphorylated profilin IIa, leading to F-actin destabilization.32 On the other hand, Rho/ROCK action may negatively regulate early axon outgrowth in cultured neurons.32 It is important to note that increased activation of RhoA GTPase was identified in the spinal cord of intermediate SMA mice and that inhibition of the main RhoA target, ROCK, contributes to their lifespan extension.44 We found increased methylation levels at one CpG site in the ARHGAP22 gene in both severe and mild SMA patients compared with the corresponding controls, which might result in decreased expression of this gene and consequently in a lower level of Rho GTPase inhibition in the patients.

This allows us to hypothesize that proteins that influence Rab and Rho GTPase activity could be SMA severity modifying factors. We propose a hypothetical simplified scheme showing the possible involvement of the ARHGAP22 gene in actin dynamics regulation and the potential connection of this process to SMN (Figure 3).

Figure 3

Simplified scheme reflecting the involvement of the ARHGAP22 gene in actin dynamics and cytoskeleton regulation and the possible connection of SMN with this process. ARHGAP22 may influence the activity of the RhoB, RhoU and RAC1 GTPases. RhoB GTPase affects ROCK and the Formin family of proteins. ROCK is known to phosphorylate myosin light chain, myosin phosphatase and LIM kinase (LIMK). LIMK in turn phosphorylates and inactivates the actin-depolymerizing and severing factor cofilin. Formin proteins catalyze actin nucleation and polymerization. These processes lead to the assembly of actin filaments and actomyosin bundles. RAC1 and RhoU GTPases affect actin polymerization through the LIMK/cofilin pathway. However, activation of the Rho/ROCK pathway may contribute to profilin IIa dephosphorylation and accordingly to F-actin destabilization. SMN, by interacting with profilin IIa, may influence its association with ROCK.

Another differentially methylated CpG site is linked to the CDK2AP1 gene. This gene encodes cyclin-dependent kinase 2 associated protein 1, a negative regulator of CDK2. Through downregulation of the CDK2 kinase, the CDK2AP1 protein inhibits the G1/S transition of the cell cycle.45 Moreover, CDK2AP1 is implicated in the stimulation of apoptosis and the reduction of cell proliferation.46 In the context of SMA pathogenesis, CDK2AP1 could be involved in the initiation of motor neuron apoptosis. Apoptosis has been demonstrated to have a crucial role in SMA motor neuron degeneration.47, 48

A difference in methylation level was found for a CpG site related to the CYTSB gene (sperm antigen with calponin homology and coiled-coil domains 1). NSP5a3b and NSP5b3b protein isoforms encoded by CYTSB are predicted to have a CH (Calponin-like) domain, which is similar to the CH domain of actin-binding proteins.49 The presence of this domain suggests that these proteins might interact with actin.49 NSP5a3a isoform is possibly involved in RNA metabolism and RNA processing as it interacts with nuclear phosphoprotein B23 and the heterogeneous ribonucleoprotein hnRNP-L. Also, the NSP5a3a isoform could be involved in the induction of apoptosis through an unknown p73-dependent process.50 We therefore speculate that the CYTSB gene may have an effect on SMA severity through its interaction with actin or participation in apoptosis processes.

One differentially methylated CpG site is associated with the SLC23A2 gene, which encodes a sodium/ascorbate cotransporter.51 This transporter maintains a high ascorbate concentration in most tissues. In the CNS, ascorbate has several functions including antioxidant protection, peptide amidation, myelin formation, synaptic potentiation and protection against glutamate toxicity.52 Ascorbate has neuroprotective properties against the oxidative damage associated with different neurodegenerative diseases such as Alzheimer’s, Parkinson’s and Huntington’s disease.52 Thus, the decreased methylation level of the CpG site in the SLC23A2 gene in both groups of SMA patients, assuming that it is accompanied by increased expression of this gene, might indicate that SLC23A2 fulfills a compensatory function for motor neurons and aids their survival in SMA patients.

Differences between SMA patients and healthy individuals in the methylation level of a CpG site in the RPL9 gene are not clearly associated with SMA. The RPL9 gene encodes a protein belonging to the L6P family of ribosomal proteins. A putative function of SMN in local translational regulation has been hypothesized previously.53 SMN was also found in stress granules of neurons, where the translation of some proteins can be blocked during the stress reaction.5 It was also demonstrated that SMN is associated with polyribosomes in dendrites.54 The decreased methylation level that we identified at one CpG site in the RPL9 gene in both severe and mild SMA patients might suggest an increased level of RPL9 gene expression and accordingly an increased amount of RPL9 protein. It is of interest to note that more than a twofold upregulation of RPL32 (ribosomal protein L32), RPL18 (ribosomal protein L8) and more than a twofold downregulation of Rps7 (ribosomal protein S7) has been found in the rostral LAL (levator auris longus) muscle from P1 SMA mice, compared with littermate controls.55

For the remaining four CpG sites, no strong association with any gene was demonstrated (Table 2). Additional description and discussion of these sites can be found in Supplementary 1.

The additional 30 genes (Supplementary Table 1), which contain two or more CpG sites differentially methylated between patients and the corresponding controls, although not the same sites in the two analyzed groups, could also be considered as possible severity modulators of SMA. Of note is the group of genes involved in transcription and gene regulation processes: KCNQ1, ATF7IP, WWTR1, NCOR2, PPP1R13L, DIP2C, CD3EAP, WHSC1 (string-db.org, GO enrichment analysis, medium confidence score 0.400). In addition, among the differentially methylated CpG sites in the first analyzed group, gene enrichment analysis revealed three GO clusters of genes that are also involved in transcription regulation. The putative role of SMN in transcription regulation has been previously discussed.56, 57 Discussion of the genes not brought up here can be found in Supplementary 1.

To conclude, we found several differentially methylated candidate loci containing genes that could be modifiers of SMA severity. Several of the identified genes appear to be associated with the cytoskeleton system, processes of neuronal development and maintenance, apoptosis and transcriptional regulation. Further studies are needed to confirm whether the peculiarities in the methylation pattern of these genes in blood cells are also present in motor neurons, as the methylation profile may differ between somatic cell types.58 The association between the methylation profile and gene expression levels should also be explored, as genes may display a relatively poor correlation between CpG methylation status and transcriptional activity, and even hypomethylation in TSS may anticorrelate with gene expression.14, 58 Thus, further research will help to clarify the contribution of these candidate genes to the molecular basis of SMA pathology.