Introduction

Amyotrophic lateral sclerosis (ALS) is a neurodegenerative condition that leads to paralysis and death often within 5 years of symptom onset.1 Following the discovery in 1993 that mutations of the SOD1 gene account for 10% of familial ALS,2 there has been an intensive effort to understand the genetics of both familial and sporadic ALS (SALS). Significant progress has been made in familial ALS, with mutations identified in several genes through linkage studies.3 However, 90% of ALS cases occur as apparently sporadic illness, and here progress using candidate gene paradigms has been limited by lack of replication between populations.4

The recent advent of robust high-density genotyping arrays has promised a new era for the identification of genetic risk factors for SALS. The genome-wide association (GWA) approach has the advantage over candidate gene studies of being unbiased by a priori hypotheses about disease biology. GWA has been successful in identifying a single gene of major effect in macular degeneration, using a sample size of less than 100 affected individuals. However, in conditions in which the phenotype may be variable, and where there may be multiple susceptibility genes of small effect, sample size is of critical importance.5 The incidence of ALS, which is 2 per 100 000 person-years, limits the availability of large numbers of patients for study.6 However, the past year has seen the publication of four GWA studies for SALS, each with a different list of most associated single-nucleotide polymorphisms (SNPs).7, 8, 9, 10 These studies were designed such that each has the power only to identify variants of large effect size. As it is now clear that no single locus drives genetic risk for SALS, the contribution of common genetic variability to the phenotype must rest upon the contribution of many genes, each conferring a relatively modest increase in risk. To identify such risk factors, investigators have turned to mining the individual GWA data sets for commonly associated SNPs.10, 11

Using such an approach, van Es et al11 in the Netherlands identified 15 SNPs commonly associated with SALS in the US and Dutch GWA data sets, and investigated replication of these SNPs in three additional SALS control populations from Sweden, Belgium and the Netherlands.11 They were the first to identify a common association of rs10260404, an intronic variant in the gene encoding dipeptidyl-peptidase 6 (DPP6), among each of their five study populations.11 Using a similar methodology, we recently reported a joint analysis of the Irish, US and Dutch GWAs.10 Interestingly, the same variant was the top hit in our analysis, confirming and extending the supportive data for DPP6 to the Irish.

Here, we present a follow-up on our joint analysis signals in an expanded Irish population and an independent Polish population. Using a list of SNPs that are commonly associated in three GWAs, we test for confirmation of association with ALS by sample addition.

Materials and methods

Study design and aims

In phase I of the study, we sought to identify SNPs commonly associated with SALS in the three publicly available genome-wide data sets from Ireland, the United States and the Netherlands.8, 9, 10 In phase II, we genotyped this list of SNPs in 713 additional samples from the Irish and Polish populations. We hypothesized that a truly associated SNP would attain a genome-wide level of significance in the expanded data set (phase I and II).

Participants

Table 1 shows the demography of the three publicly available GWA series used in phase I. The Irish series included 432 unrelated participants of self-declared Irish ethnicity for at least three generations (221 ALS cases and 211 neurologically normal controls).10 Genotyping was performed using Illumina HumanHap 550K SNP chips (Illumina Inc., San Diego, CA, USA) and raw sample-level genotyping data have been made available at http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000127.v1.p1. The US series comprised 276 unrelated, white, non-Hispanic individuals diagnosed with ALS and 271 neurologically normal individuals of similar ethnicity drawn from across the United States.8 Genotyping was undertaken using Illumina HumanHap 550K SNP chips and full phenotype and raw genotyping data are available at https://queue.coriell.org/Q/index.asp. The Netherlands series included 911 unrelated individuals with all four grandparents born in the Netherlands (461 ALS cases and 450 neurologically normal controls). This cohort was genotyped using the Illumina HumanHap 300K platform.9 Minor allele frequencies (MAFs) and allelic P-values for each genotyped SNP have been made available at http://www.alscentrum.nl/index.php?id=GWA. Stringent quality control criteria have been applied to these data sets as previously described.8, 9, 10 In particular, samples with genotyping call rates below 95% and cryptically related individuals have been removed.

Table 1 Characteristics of the study populations

The demography of the phase II study participants is also given in Table 1. All Irish DNA samples were collected at Beaumont Hospital in Dublin, Ireland, and the Polish DNA samples were collected at the Department of Neurology, MND Clinic, Jagiellonian University, Krakow, Poland. All patients fulfilled the 1994 El Escorial criteria12 for probable or definite SALS and were phenotyped by physicians with expertise in ALS. Patients with a family history of ALS were excluded from the study. Control DNA samples were collected from healthy, unrelated, neurologically normal individuals, either spouses of ALS patients or those accompanying non-ALS patients. Controls were matched for age, gender and ethnicity. All participants gave written informed consent and local ethics review boards approved all procedures.

Phase I: selection of commonly associated SNPs from the GWAs

The individual Irish, US and Dutch genome-wide ALS association studies have low power to detect loci with moderate effect sizes. Nonetheless, the signal from truly associated SNPs might be present, although weak. The size of the Dutch cohort is approximately double that of each of the Irish and US cohorts. Thus, to select markers for replication, we identified those SNPs commonly associated for the same allele at an allelic P-value below 0.05 in the Dutch GWA study and at an allelic P-value below 0.1 in each of the Irish and US GWA studies. SNPs with MAFs below 0.01 or with P-values below 0.01 for deviation from Hardy–Weinberg equilibrium (HWE) in controls were excluded. The selected SNPs were then genotyped in phase II.
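The selection rule above can be sketched as a simple filter. This is a minimal illustration only: the `SnpResult` record layout and the function name are hypothetical, and applying the MAF and HWE filters within every cohort is an assumption, as the text does not state in which cohorts these filters were evaluated.

```python
from dataclasses import dataclass


@dataclass
class SnpResult:
    """Per-cohort summary for one SNP (hypothetical record layout)."""
    p_value: float         # allelic P-value in this cohort
    risk_allele: str       # allele over-represented in cases
    maf: float             # minor allele frequency
    hwe_p_controls: float  # P-value for deviation from HWE in controls


def is_commonly_associated(irish, us, dutch):
    """Return True if a SNP meets the phase I selection criteria."""
    cohorts = (irish, us, dutch)
    # The association must point to the same allele in all three cohorts.
    same_allele = irish.risk_allele == us.risk_allele == dutch.risk_allele
    # Stricter threshold for the larger Dutch cohort, looser for the others.
    passes_p = dutch.p_value < 0.05 and irish.p_value < 0.1 and us.p_value < 0.1
    # Assumption: MAF and HWE filters are applied within each cohort.
    passes_qc = all(c.maf >= 0.01 and c.hwe_p_controls >= 0.01 for c in cohorts)
    return same_allele and passes_p and passes_qc
```

Running such a filter over every SNP shared by the three data sets would give the list carried forward to phase II.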

Phase II genotyping

Genotyping of the selected SNPs was performed in 139 additional Irish participants (91 patients with SALS and 48 control subjects) and 574 Polish participants (218 patients with SALS and 356 controls). DNA was extracted from fresh venous blood using standard procedures. Genotyping was performed by KBiosciences (Herts, UK) using KASPar chemistries. Quality control criteria included mixing of case and control samples on individual plates, inclusion of water controls as negatives, sample duplicates between plates and blinding of data. The mean SNP call rate for the Irish study population was 98.6% and the lowest call rate was 96.1%. The mean SNP call rate for the Polish study population was 97.9% and the lowest call rate was 94.1%. To facilitate confirmation of strand and direction between the Illumina and KASPar genotyping assays, we also genotyped 27 SNPs in 10 replicate individuals from the Irish genome scan using the KASPar platform. The genotyping concordance rate between the Illumina and KASPar assays was 99.3%, with the non-concordant calls being accounted for by two failed assays on the KASPar platform.
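As a check on the reported concordance rate, the arithmetic can be sketched as follows. It assumes that the denominator is 27 SNPs × 10 individuals and that the two failed assays correspond to two individual genotype calls; neither assumption is stated explicitly in the text.

```python
n_snps = 27
n_individuals = 10
failed_calls = 2  # assumption: each failed assay is one missing genotype call

total_calls = n_snps * n_individuals
concordance = (total_calls - failed_calls) / total_calls
print(f"{concordance:.1%}")  # prints 99.3%
```

Under these assumptions, 268 of 270 calls agree, which is consistent with the reported 99.3%.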

Statistical analysis

Each SNP was tested for allelic association with SALS by the χ2 test of independence. Estimations of departures from HWE were calculated by the χ2 test in controls. Analyses were computed using PLINK v 1.01 software.13 Power statistics were calculated assuming an MAF of 0.4 for SNPs conferring an odds ratio of 1.37.14 For phase I, the power of the Irish and US GWA data sets to detect such SNPs is 74 and 83%, assuming an α of 0.1, whereas the Dutch GWA has 90% power to detect the same SNPs assuming an α of 0.05.
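For illustration, the allelic test and a power estimate can be sketched with the standard library alone. This is not the PLINK implementation: the chi-square statistic is the uncorrected Pearson form (no continuity correction, an assumption), and the power function uses a two-sided normal approximation for the difference in allele frequencies, taking the MAF of 0.4 as the control allele frequency. Under these assumptions the estimates fall near the reported figures, although exact agreement with the authors' calculation is not guaranteed. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def allelic_chi2(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts."""
    table = [[case_minor, case_major], [ctrl_minor, ctrl_major]]
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def allelic_power(n_cases, n_controls, maf, odds_ratio, alpha):
    """Approximate two-sided power of the allelic test (normal approximation)."""
    p0 = maf  # assumed allele frequency in controls
    p1 = odds_ratio * p0 / (1 - p0 + odds_ratio * p0)  # implied frequency in cases
    # Each individual contributes two alleles.
    se = math.sqrt(p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls))
    z = (p1 - p0) / se
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(z - z_crit) + nd.cdf(-z - z_crit)


# Example: power of the Irish GWA cohort for MAF 0.4, odds ratio 1.37, alpha 0.1.
irish_power = allelic_power(221, 211, maf=0.4, odds_ratio=1.37, alpha=0.1)
```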

In phase II, we report uncorrected allelic P-values for the Irish (n=139) and Polish (n=574) replication samples. We have not applied Bonferroni correction for the 27 SNPs tested in the replication populations, as only nominal trends are likely to be detected within individual populations at the sample sizes we have used. Therefore, meaningful conclusions can only be drawn from strengthening of association across the study by addition of the replication samples to the GWA data. Pooled genotype data for the SNPs common to all three GWAs (phase I) have only 34% power to reach the genome-wide Bonferroni threshold (P-value below 1.73 × 10−7 accounting for 287 522 tests); by contrast, the final study data (phase I and II) have 66% power to reach this level of significance. Importantly, a priori, we set genome-wide Bonferroni significance in the combined phase I and II data as the threshold to declare significant association of any SNP with ALS.
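The genome-wide threshold follows from dividing the family-wise error rate by the number of tests. The sketch below assumes a family-wise α of 0.05, which the text does not state explicitly.

```python
alpha = 0.05        # assumed family-wise error rate
n_tests = 287_522   # autosomal SNPs common to the three GWA data sets

threshold = alpha / n_tests
print(f"{threshold:.3e}")  # prints 1.739e-07
```

The result agrees with the quoted threshold of 1.73 × 10−7 up to truncation of the last digit.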

Results

In phase I, we compared the Irish (221 SALS cases; 211 controls), US (276 SALS cases; 271 controls) and Dutch (461 SALS cases; 450 controls) genome-wide data sets.8, 9, 10 There were 287 522 autosomal SNPs in common between these studies that were successfully genotyped in all the three cohorts. Among these, there were 27 SNPs associated for the same allele at a P-value below 0.1 in the Irish and US and below 0.05 in the Dutch cohort (Table 2 and Supplementary Table 1). As previously reported,10 the strongest pooled association was for an intronic variant, rs10260404, in the DPP6 gene.

Table 2 Minor allele frequencies and P-values for the 27 selected SNPs in the pooled Irish, US and Dutch GWA studies (phase I) and following addition of the replication Irish and Polish samples to the GWA data (phase I and II)

In phase II, the 27 selected SNPs were genotyped in the additional Irish (91 cases and 48 controls) and Polish (218 cases and 356 controls) populations. Eleven of these SNPs, including rs10260404, reached a final uncorrected allelic P-value below 0.05 in the expanded Irish data set (Supplementary Table 2). None of these 11 SNPs showed evidence of association in the Polish ALS population. Indeed, only rs6922711, an intronic marker in the gene encoding Rho GTPase-activating protein 18 (ARHGAP18), reached a final uncorrected allelic P-value below 0.05 in the Polish sample (Supplementary Table 2).

The most important results are the effects of combining phase I and phase II. On analysis of combined genotyping data, for a total of 1267 patients with ALS and 1336 controls, only 4 of the 27 SNPs (rs6922711, rs1942239, rs10024717 and rs2159942) gained in strength of allelic association compared with the phase I GWA data alone (Table 2). No SNP reached a final level of genome-wide significance in the combined analysis.
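The combined totals can be verified directly from the cohort sizes reported in the Results. The dictionary layout below is illustrative only.

```python
# (cases, controls) for each cohort, as reported in the text.
cohorts = {
    "Irish GWA": (221, 211),
    "US GWA": (276, 271),
    "Dutch GWA": (461, 450),
    "Irish replication": (91, 48),
    "Polish replication": (218, 356),
}

total_cases = sum(cases for cases, _ in cohorts.values())
total_controls = sum(controls for _, controls in cohorts.values())
print(total_cases, total_controls)  # prints 1267 1336
```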

Discussion

Several lines of evidence support the hypothesis that genetic susceptibility factors contribute to ALS pathogenesis. Epidemiological studies have established that between 2 and 5% of ALS cases are familial, most often with an autosomal dominant pattern of inheritance.3 Linkage studies have identified causal variants in a portion of these families, establishing that the phenotype may develop as a monogenic process.2, 3 The expression of mutant SOD1 in mouse models produces an ALS phenotype.15 Finally, twin concordance studies indicate that the heritability of SALS is between 0.38 and 0.85.16 These observations have led to the suggestion that SALS is a complex oligogenic disease, requiring the interaction of several genes in a single patient, each contributing a portion of the underlying risk.17

We hypothesized that common SNPs contributing such a modest effect would be detected in the recent GWA studies of SALS. Hoping to achieve genome-wide significance for 27 SNPs, we genotyped additional Irish and Polish individuals. One SNP, rs6922711 in ARHGAP18, showed nominal association in the Polish cohort and gained in overall strength of association. However, no SNP showed a convincing pattern of association across all individual study populations, nor did any SNP achieve Bonferroni significance overall.

Perhaps the most unanticipated finding from our study is the lack of replication of the rs10260404 variant in the DPP6 gene. A compelling replication at this locus has been demonstrated in five previous populations of Northern European ancestry11 and by us in the Irish GWA.10 Importantly, and for the first time in SALS genetics, the same risk allele of the same SNP was seen in each of the populations. However, in this paper, the non-risk allele was markedly over-represented among ALS cases in a Polish population. Although we concede that sampling error and chance may account for these findings, the degree of difference in allele frequencies in the Polish groups, despite a relatively uniform pattern in six other European case–control data sets, leads us to hypothesize a mechanism for population-specific difference that could account for the DPP6 findings.

If motor neuron degeneration is regarded as one pole of a continuum of normal variation, then it may be that a constellation of common SNPs, when inherited together, contributes to the onset of ALS. In this instance, it seems likely that the risk polymorphisms will be common among populations of European ancestry.5 By contrast, SALS could equally result from rare mutations, whether single point, copy number or epigenetic, with variable penetrance or de novo change accounting for the non-familial pattern of incidence.3, 4 Unlike SNPs, which have relatively uniform frequency between Caucasian populations,18 mutations may segregate within major ethnic pockets. For example, the frequency of LRRK2 gene mutations in several series of patients with sporadic Parkinson disease is 0.6–2% in the United States, 4–8% in Spain and Portugal, 10–18% in the Ashkenazi Jewish population and over 40% in Arab populations.19, 20, 21, 22, 23 In a sporadic mutation model of ALS, common SNPs may still modulate the penetrance or phenotype of a particular mutation through epistasis.24 Such SNP–mutation epistasis has been demonstrated in a recent GWA of unselected breast cancer cases (over 4300 case–control pairs), which found that SNPs in the genes FGFR2 and MAP3K1 increase breast cancer susceptibility.25 A follow-up study segregating phenotypes by BRCA mutation indicated that these SNPs acted exclusively in BRCA2 mutation carriers and were not risk factors for BRCA1-related breast cancer.25 SNP–mutation interactions mean that the findings of a case–control association study of SNPs in a given population will vary depending on the underlying prevalent mutations in that population.25, 26 This may account for the failure of replication in some geographical distributions and poses challenges for the GWA approach to identify risk variants for SALS.

An alternative view is that SNPs should associate equally with disease risk across populations and that the DPP6 association in Northern Europeans is a false positive. Indeed, the obvious limitation of our study is the relatively small sample size of the GWA data sets used to identify markers for phase II, although at least for the DPP6 variant, supportive data are available from additional populations. Even with the combined phase I and II data, power remains low to detect the oligogenic influence of SNPs. It is also important to bear in mind that the primary aim of our analysis was to increase the sample size of the GWA data. Consequently, nominal P-values reported for individual populations must be interpreted with caution. Given the modest effects now expected for SNPs in SALS, the next step should be the expansion of sample size.

In summary, we tested whether commonly associated SNPs from three recent SALS genome-wide studies would attain greater significance when tested in an expanded cohort. Our findings caution against drawing conclusions, positive or negative, from the presently available GWA data sets. Further whole-genome genotyping both within and between populations will be necessary to fully define the contribution of SNPs to SALS.

Conflict of interest

None declared.