Introduction

Amyotrophic lateral sclerosis (ALS) is the most common motor neuron disease, characterised by progressive skeletal muscle atrophy that leads to death, mostly within 3–5 years of disease onset1. Mutations in more than 30 genes have been identified as causes of ALS, including SOD1, TARDBP, FUS, and C9orf72. Approximately 50–70% of ALS cases with a family history of the disease can be attributed to mutations in these genes; however, these mutations are found in only 3% of Japanese patients with sporadic ALS2. A twin study estimated the heritability of sporadic ALS to be 61%3; sporadic ALS is therefore thought to be a multifactorial disease to which multiple genetic and environmental factors contribute. Previous genome-wide association studies (GWAS) reported six common loci showing genome-wide significant associations with ALS, which explained only 0.2% of its heritability4. To elucidate the pathophysiology of sporadic ALS and to develop appropriate therapies, the genetic background of sporadic ALS needs to be understood more clearly.

Assuming that causal variants shared across populations are important for investigating the pathophysiology of sporadic ALS, a cross-ethnic meta-analysis of GWAS would have valuable implications. In addition, cross-ethnic GWAS has an advantage for fine-mapping because linkage disequilibrium patterns that differ across populations can improve the resolution5. Here, we report analyses of new GWAS data from 1173 sporadic ALS cases and 8925 normal controls in a Japanese population and a meta-analysis with the largest ALS study in a European population6. We also validate the candidate region with a combined replication study using 707 other ALS cases and 971 controls from a Japanese population and a Chinese dataset7. Using a gene-based analysis, we identify four additional genes significantly associated with ALS. The discovery of novel risk genes advances our understanding of ALS aetiology.

Results

Genome-wide association analysis

To identify new susceptibility loci for sporadic ALS, we conducted a GWAS in a Japanese sample of 1173 sporadic ALS cases and 8925 controls (Supplementary Figs. 1, 2, and 3). Fifty-six SNPs passed the suggestive threshold of p < 5.0 × 10−6 (Supplementary Data 1), whereas no individual SNP passed the genome-wide significance threshold of p < 5.0 × 10−8. The data were then incorporated into a meta-analysis with a large-scale GWAS involving 20,806 patients diagnosed with ALS and 59,804 control subjects of European ancestry6. The Manhattan plot in Fig. 1 shows four loci previously identified in the European population that achieved genome-wide significance (Fig. 1a; loci include GPX3-TNIP1, C9orf72, TBK1, and UNC13A; Fig. 1b)4,6,7. In addition, we found a novel genome-wide significant signal for SNPs in linkage disequilibrium on chromosome 10q25.2. This region spans several hundred kilobases and encompasses ACSL5 (Supplementary Data 2). Table 1 lists the top three SNPs displaying p < 5.0 × 10−8 in the ACSL5 gene body region: rs58854276 (p = 2.97 × 10−8; odds ratio (OR) = 1.080; 95%CI = 1.065–1.095), rs11195948 (p = 3.99 × 10−8; OR = 1.079; 95%CI = 1.064–1.094), and rs3736947 (p = 3.61 × 10−8; OR = 1.080; 95%CI = 1.065–1.095). The top three SNPs in ACSL5 are all intronic variants. Figure 2 shows the regional association plot of this locus.

Fig. 1: Meta-analysis between European and Japanese genome-wide association studies (GWAS) revealed a novel locus.

a Manhattan plot of the meta-analysis between European and Japanese GWAS. The novel locus ACSL5 is shown in red. Loci identified in other studies are presented with lines along the gene names, i.e. GPX3-TNIP1, C9orf72, TBK1, and UNC13A. The red horizontal line indicates the genome-wide significance threshold of p = 5 × 10−8. The blue line indicates the suggestive threshold of p = 5 × 10−6. The p-values of Japanese GWAS were calculated from the logistic regression model. b Q–Q plot of the meta-analysis between European and Japanese GWAS.

Table 1 Top three SNPs displaying p < 5.0 × 10−8 in the ACSL5 gene body region by GWAS of the meta-analysis between European and Japanese (JaCALS) cohorts.
Fig. 2: Regional ALS association plot of the ACSL5 locus from the meta-analysis results using LocusZoom.

The meta-analysis between European and Japanese GWAS shows rs58854276 as the SNP most strongly associated with ALS (p = 2.97 × 10−8).

Replication analysis

The association was replicated in the combined meta-analysis of the Chinese dataset (1234 ALS cases and 2850 controls)7 and an independent new Japanese dataset (707 ALS cases and 971 controls). Among the three top SNPs from the discovery stage, two SNPs, rs11195948 and rs3736947, were validated (Table 1, Fig. 3, Supplementary Fig. 4). The p-value of rs11195948 was the most significant (p = 1.82 × 10−4; OR = 1.136; 95%CI = 1.098–1.176). The direction of effect in the Chinese sample (p = 2.09 × 10−4; OR = 1.199; 95%CI = 1.103–1.290, Table 1, Fig. 3) was the same as that in the European and initial Japanese samples of the discovery dataset. However, the result of the Japanese replication dataset was not conclusive (p = 0.554; OR = 0.960; 95%CI = 0.838–1.100, Fig. 3, Supplementary Fig. 4). For the combined dataset of the discovery and replication studies, the most significant SNP was rs3736947 (p = 7.81 × 10−11; OR = 1.087; 95%CI = 1.073–1.101, Table 1, Fig. 3).

Fig. 3: Forest plots showing the effects of rs3736947 in ACSL5 in each cohort and meta-analysis.

Forest plots showing the effects of rs3736947 on ALS in each cohort and meta-analysis.

Multi-ethnic meta-analysis

To maximise the power to detect genes related to ALS, we conducted the largest meta-analysis, with 23,213 cases and 71,579 controls from Japanese, European, and Chinese populations (Fig. 4a, b, and Supplementary Data 3). In the meta-analysis, the SNPs in ACSL5 also reached genome-wide significance: rs58854276 (p = 6.23 × 10−10), rs11195948 (p = 1.79 × 10−11), and rs3736947 (p = 1.84 × 10−11). In addition, a novel locus in the long non-coding RNA TSBP1-AS1 reached genome-wide significance (rs140736091, p = 1.36 × 10−8), although this locus requires a replication study.

Fig. 4: Multi-ethnic meta-analysis among European, Japanese, and Chinese genome-wide association studies (GWAS).

a Manhattan plot of the multi-ethnic meta-analysis among European, Japanese, and Chinese GWAS. The novel loci ACSL5 and TSBP1-AS1 are shown in red. Loci identified in other studies are presented with lines along the gene names, i.e. GPX3-TNIP1, C9orf72, TBK1, and UNC13A. The red horizontal line indicates the genome-wide significance threshold of p = 5 × 10−8. The blue line indicates the suggestive threshold of p = 5 × 10−6. The genes indicated in red text colour with lines are novel genes and other genes are known risk genes. b Q–Q plot of the multi-ethnic meta-analysis among European, Japanese and Chinese GWAS.

Functional analysis of ACSL5

The multi-tissue eQTL database GTEx v8 (https://www.gtexportal.org/home/)8 revealed a significant relationship between rs3736947 and ACSL5 expression. The risk allele (C) is associated with higher ACSL5 expression than the non-risk allele (A) (p = 6.5 × 10−49, normalised effect size = −0.39 in the whole-blood dataset). To confirm the association, we conducted gene expression analysis for ACSL5 by quantitative real-time PCR using lymphoblastoid B cell lines (LCLs) from age-matched and sex-matched Japanese patients with ALS with each genotype. The expression of ACSL5 was significantly higher in the LCLs of patients with the CC (n = 20) and AC (n = 20) genotypes of rs3736947 than in those with the AA genotype (n = 20) (CC vs AA, p = 0.0035; AC vs AA, p = 0.0166; Kruskal–Wallis test followed by Steel–Dwass test; Fig. 5, Supplementary Data 4).

Fig. 5: Relative expression of ACSL5 mRNA in LCLs from ALS patients with each genotype of rs3736947 in the ACSL5 gene.

Gene expression analysis for ACSL5 was conducted by quantitative real-time PCR using lymphoblastoid B cell lines (LCLs) from Japanese patients with sporadic ALS. The mRNA expression of each genotype of rs3736947 was compared using the Kruskal–Wallis test followed by the Steel–Dwass test. The expression of ACSL5 was significantly higher in the LCLs of patients with the CC (n = 20) and AC (n = 20) genotypes of rs3736947 than in those with the AA genotype (n = 20) (**CC vs AA, p = 0.0035, *AC vs AA, p = 0.0166). Circles represent individual data points. The bottom and the top of the box indicate the interquartile range (the 25th and 75th percentiles) and the line represents the median. The whiskers under and over the box correspond to the minimum and maximum values.

Gene-based association analysis

We performed gene-based association analysis in the largest meta-analysis set, with 23,213 cases and 71,579 controls from Japanese, European, and Chinese populations. The gene-based test confirmed the previously reported genes TNIP1 (p = 3.91 × 10−7)7, C9orf72 (p = 1.96 × 10−21)9, KIF5A (p = 1.06 × 10−7)6, and SCFD1 (p = 2.25 × 10−6)4. In addition, the novel genes ERGIC1 (p = 5.90 × 10−7), RAPGEF5 (p = 3.71 × 10−7), FNBP1 (p = 9.59 × 10−9), ACSL5 (p = 4.50 × 10−11), and ATXN3 (p = 5.81 × 10−7) also reached genome-wide significance (Table 2, Fig. 6a, b, Supplementary Data 5).

Table 2 Summary of significant genes in the multi-ethnic meta-analysis of gene-based association analysis among Japanese (JaCALS), European, and Chinese cohorts.
Fig. 6: The gene-based analysis for the multi-ethnic meta-analysis among European, Japanese, and Chinese cohorts.

a Manhattan plot of the gene-based analysis for the multi-ethnic meta-analysis among European, Japanese, and Chinese cohorts. The red horizontal line indicates the genome-wide significance threshold of p = 2.85 × 10−6 (=0.05/17544). The blue line indicates the suggestive threshold of p = 2.85 × 10−4. The genes indicated in red are novel genes, and the other genes are known risk genes. b Q–Q plot of gene-based analysis for the multi-ethnic meta-analysis among European, Japanese, and Chinese cohorts.

Discussion

In this study, the region in ACSL5 was discovered as a novel risk locus for sporadic ALS by meta-analysis between Japanese and European datasets and was replicated in the Chinese dataset and another Japanese dataset. Expression analysis showed that the risk allele is associated with increased ACSL5 expression. The expression of ACSL5 mRNA in spinal motor neurons isolated by laser-capture microdissection in 12 sporadic ALS patients and nine controls was catalogued by Batra et al.10,11. ACSL5 mRNA expression was possibly higher in sporadic ALS than in controls (Supplementary Fig. 5; p-value = 0.033 with Mann–Whitney U test). Similarly, another report showed that ACSL5 mRNA expression in the spinal anterior horn was upregulated in sporadic ALS patients compared with that in controls12.

ACSL5 is a member of the acyl-CoA synthetase long chain family. Acyl-CoA synthetase produces acyl-CoA for numerous metabolic pathways, such as cellular lipid metabolism, transcriptional regulation, intracellular protein transportation, and protein acylation, in various tissues, including skeletal muscle, the liver, and the brain13. ACSL5 is a neurotoxic A1 astrocyte-related gene and is up-regulated in A1 astrocytes14. A1 astrocytes are abundant in various neurodegenerative diseases, including ALS, and they induce the death of neurons in the central nervous system15. We speculate that increased expression of ACSL5 could induce A1 astrocytes, cause motor neuron death, and lead to ALS development. Another possible link between ACSL5 and ALS is lipid metabolism. The risk allele (A) of rs58854276 has been reported to be associated with lower HDL-cholesterol in Japanese individuals16. Several reports have indicated that dyslipidaemia increases the risk of ALS17,18. However, the association between ALS and dyslipidaemia has not been replicated in other studies19. Further studies are warranted to clarify the association between ACSL5 and ALS onset.

In the largest meta-analysis with 23,213 cases and 71,579 controls from Japanese, European, and Chinese populations, we identified a novel locus at 6p21, which reached genome-wide significance, in addition to ACSL5. The top SNP in the locus (rs140736091) was in the long non-coding RNA TSBP1-AS1. Some studies have reported that long non-coding RNAs are associated with ALS, but their role in ALS still needs to be elucidated20. Therefore, further replication study and functional analysis will be needed to clarify the association between rs140736091 and patients with ALS.

The gene-based test for the largest multi-ethnic meta-GWAS of Japanese, Chinese, and European populations revealed the novel genes ERGIC1, RAPGEF5, FNBP1, and ATXN3, in addition to ACSL5.

ERGIC1 is a membrane-bound protein that is localised to the endoplasmic reticulum (ER)–Golgi intermediate compartment (ERGIC). The ERGIC mediates membrane traffic and selective transport of cargo between the ER and the Golgi complex21. ER–Golgi transport dysfunction is reported to be a common pathogenic mechanism in SOD1-, TDP-43-, and FUS-associated ALS22, and ER stress has been implicated in ALS aetiology. Combined GWAS of genetic overlap between ALS and frontotemporal dementia-spectrum neurodegenerative diseases identified rs538622 near ERGIC123.

RAPGEF5 is a member of the Ras subfamily of GTPases, which function in signal transduction for cell growth and differentiation as guanosine diphosphate (GDP)/guanosine triphosphate (GTP)-regulated switches cycling between inactive GDP- and active GTP-bound states24. RAPGEF5 has been reported to be associated with telencephalic neurogenesis25. The RAPGEF5 transcript is expressed predominantly in the brain.

FNBP1, a member of the formin-binding protein family, is a membrane-associated protein. It plays an important role in clathrin-mediated endocytosis26. FNBP1 is upregulated in the spinal cord of SOD1 G93A mice27.

ATXN3 is a ubiquitously expressed deubiquitinase that plays important roles in the ubiquitin proteasome system, transcriptional regulation, and neuroprotection28. A recent meta-analysis of European GWAS suggested an association between a SNP (rs10143310) in ATXN3 and ALS, although the SNP did not achieve genome-wide significance (p = 3.2 × 10−7)6. The CAG repeat expansion in the coding region of ATXN3 causes spinocerebellar ataxia type 3 (SCA3)29. SCA3 patients and ALS patients have common pathologies, such as TDP-43-positive inclusions in the lower motor neurons of the anterior horn of the spinal cord and brainstem30. Another gene involved in cerebellar ataxia, ATXN2, has also been described as a risk gene for sporadic ALS: an intermediate repeat expansion in ATXN2 is associated with the risk of ALS31.

In conclusion, multi-ethnic GWAS identified the association of the ACSL5 locus with ALS. This locus was identified by combining GWAS results from our Japanese dataset with the largest set of European GWAS data and was replicated in the combined analysis of an independent Japanese cohort and a Chinese cohort. In addition, gene-based analysis identified ERGIC1, RAPGEF5, FNBP1, ACSL5, and ATXN3. Although these genes reached significance at the discovery stage of the analysis, further replication analysis or functional analysis in ALS is warranted. Nevertheless, the discovery of novel risk loci advances our understanding of ALS aetiology.

Methods

Study subjects in the Japanese cohort

In the discovery cohort, we performed a GWAS in Japanese sporadic ALS cases from the Japanese Consortium for ALS research (JaCALS)32 and in normal controls from the Tohoku Medical Megabank Project (TMM)33,34. In the replication cohort, we obtained DNA from ALS patients registered by BioBank Japan35,36 and normal controls registered in the Pharma SNP Consortium. The ethics committees of the respective research projects approved this study. Written informed consent for this study was obtained from all the participants.

DNA extraction from ALS patients

To extract genomic DNA, peripheral whole-blood samples were processed using an Autopure LS system (Qiagen, Hilden, Germany) for automated nucleotide purification according to the manufacturer’s instructions. We omitted RNase treatment, measured the concentration of the double-stranded DNA with PicoGreen (Life Technologies, Carlsbad, CA, USA), and adjusted the concentration of the DNA to 200 ng/μL in Elution Buffer (Qiagen).

Japanese ALS patients for the discovery study

The case cohort comprised 1,245 candidate sporadic ALS patients recruited from the Japanese Consortium for ALS research (JaCALS), which included 32 neurology facilities in Japan. Patients with a family history of ALS were excluded. The included patients were diagnosed with definite, probable, probable laboratory-supported, or possible ALS based on the revised El Escorial criteria37 and were of Japanese ancestry. The DNA samples used in this study were collected from February 2006 to December 2017. Approval of the overall research was obtained from the Ethics Committee of Nagoya University (approval number 2004-0181-2). The ethics committees of all participating institutions approved the study. Details of the JaCALS have been described elsewhere32. All 1,245 sporadic ALS patients were genotyped with Illumina (San Diego, CA, USA) HumanOmniExpressExome-8 BeadChips (versions 1.0, 1.2, and 1.3).

Japanese controls for the discovery study

The control cohort was the prospective cohort of the Tohoku Medical Megabank Project (TMM)33,34. From these TMM samples, we used 9,924 samples genotyped with Illumina HumanOmniExpressExome-8 BeadChips (version 1.2-B) at RIKEN. Approval to use the dataset as the control was obtained from the Ethical Committee of the Tohoku University Graduate School of Medicine, the TMM, and the Materials and Information Distribution Review Committee of Tohoku Medical Megabank Organization.

Quality control of genotyped SNPs and imputation

For the combined genotyped dataset with 1,245 cases and 9,924 controls, we selected SNPs with minor allele frequency ≥0.03, Hardy–Weinberg equilibrium p-value ≥0.001, and genotype call rate ≥95%. This step retained 516,447 of 936,632 SNPs. The qualified genotype dataset was imputed with the 2KJPN reference panel of 2,037 whole genomes from TMM38 using Impute 439. The target individuals for the imputation were not themselves included in the 2KJPN reference panel40. According to the protocol by Shido et al.41, the imputed genotype data in Oxford GEN format were converted to Plink BED format by selecting the genotype with the highest posterior probability for each SNP and individual. In the conversion, genotypes whose highest posterior probability was less than 0.9 were handled as missing. Finally, we constructed the imputed dataset of 1,245 cases and 9,924 controls with 27,893,192 SNPs.
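
The filtering and hard-call rules in this paragraph can be illustrated with a minimal sketch. It is not the pipeline used in the study (which relied on Plink and the protocol of Shido et al.); the function names `passes_qc` and `hard_call` and the example inputs are hypothetical, and only the thresholds are taken from the text.

```python
from typing import Optional, Sequence

# Thresholds stated in the text.
MAF_MIN = 0.03          # minor allele frequency
HWE_P_MIN = 0.001       # Hardy-Weinberg equilibrium p-value
CALL_RATE_MIN = 0.95    # genotype call rate
POSTERIOR_MIN = 0.9     # minimum posterior probability for a hard call


def passes_qc(maf: float, hwe_p: float, call_rate: float) -> bool:
    """Return True if a SNP satisfies all three QC criteria."""
    return (maf >= MAF_MIN
            and hwe_p >= HWE_P_MIN
            and call_rate >= CALL_RATE_MIN)


def hard_call(posteriors: Sequence[float]) -> Optional[int]:
    """Convert the three genotype posteriors of one SNP and individual
    (hypothetical order: P(AA), P(AB), P(BB)) to a hard call.

    Returns the index of the most probable genotype (0, 1, or 2),
    or None (missing) if that probability is below POSTERIOR_MIN.
    """
    best = max(range(3), key=lambda g: posteriors[g])
    if posteriors[best] < POSTERIOR_MIN:
        return None
    return best
```

For example, `hard_call([0.02, 0.95, 0.03])` yields the heterozygous call, whereas `hard_call([0.5, 0.4, 0.1])` is treated as missing.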

Pre-processing and GWAS

We excluded 69 suspected ALS patients and two non-ALS patients from the 1,245 cases, leaving 1174 cases. Samples related by identity-by-descent (proportion >0.1875) among the 1174 cases and 9924 controls were removed. The thresholds were based on other published studies42,43,44,45. Finally, 1173 cases and 8925 controls were used for GWAS. For GWAS, we selected 4,349,201 SNPs with minor allele frequency ≥0.03, Hardy–Weinberg equilibrium p-value ≥0.001, and genotype call rate ≥95%. We conducted logistic regression analysis with principal components 1–20 from the principal component analysis as covariates using Plink (version 1.90b5.1).
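
To make the association model concrete, the following is a from-scratch sketch of a logistic regression of case status on genotype dosage plus covariates, fitted by Newton-Raphson. The study used Plink for this step; this code is not Plink's implementation, the names `solve` and `fit_logistic` are hypothetical, and the usage example uses made-up counts rather than study data.

```python
import math
from typing import List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_logistic(rows: Sequence[Sequence[float]], y: Sequence[int],
                 n_iter: int = 25) -> List[float]:
    """Fit a logistic regression by Newton-Raphson.

    Each row is [1, genotype, covariate_1, ...] (for example the
    principal components); y is 1 for cases and 0 for controls.
    Returns the coefficients; exp(coefficient of genotype) is the
    per-allele odds ratio.
    """
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, yi in zip(rows, y):
            eta = sum(b * xj for b, xj in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (yi - p) * x[i]
                for j in range(k):
                    hess[i][j] += w * x[i] * x[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta
```

As a toy check with hypothetical counts (10 cases and 30 controls with genotype 0; 20 cases and 20 controls with genotype 1), the fitted genotype coefficient equals the log of the sample odds ratio, ln 3.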

Meta-analysis of Japanese and European datasets

We downloaded the summary statistics of the 20,806 ALS cases and 59,804 controls in the European dataset from the website http://als.umassmed.edu/6. Inverse-variance meta-analysis was conducted between the 20,806 ALS cases and 59,804 controls in the European dataset6 and the 1173 cases and 8925 controls of Japanese ancestry using METAL (version 2011-03-25)46. The region from 114,000,000 to 114,300,000 on chromosome 10, including the significant SNPs in ACSL5, was plotted using LocusZoom (version 1.4) with ‘--build hg19 --pop ASN --source 1000G_Nov2014’ options47.
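
Inverse-variance (fixed-effect) meta-analysis weights each study's effect estimate by the inverse of its squared standard error. The sketch below illustrates the calculation for a single SNP; it is not METAL itself, the function name `inverse_variance_meta` is hypothetical, and the inputs in the usage note are invented numbers, not results from this study.

```python
import math
from typing import Sequence, Tuple


def inverse_variance_meta(betas: Sequence[float],
                          ses: Sequence[float]
                          ) -> Tuple[float, float, float, float]:
    """Combine per-study log odds ratios (betas) and standard errors.

    Returns (pooled beta, pooled standard error, z-score,
    two-sided p-value from the standard normal distribution).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, z, p
```

With two hypothetical studies reporting betas of 0.1 and 0.2, each with a standard error of 0.1, the pooled beta is 0.15 and the pooled standard error is the square root of 0.005; `math.exp(beta)` converts the pooled estimate to an odds ratio.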

Japanese replication cohort

We obtained DNA from 707 ALS patients registered by BioBank Japan35,36 and 971 normal controls registered in the Pharma SNP Consortium. Twenty-nine SNPs on chromosome 10q25.2 that passed the suggestive threshold of p < 5 × 10−6 were genotyped via the multiplex PCR-based target sequencing method. Primers were designed using Primer 3 software. The products of multiplex PCR were sequenced using Illumina HiSeq 2500 (Illumina Inc., San Diego, CA, USA). The sequence data were analysed using a standard pipeline. The details of the methods have been previously described48.

Genome-wide genotype information was not available for these samples. We therefore conducted logistic regression analysis for these 29 SNPs using Plink without covariates such as principal components.

Meta-analysis of Chinese and Japanese replication cohorts

We downloaded the summary statistics of the 1234 ALS cases and 2850 controls in the Chinese cohort7. Inverse-variance meta-analysis was conducted for the 1234 ALS cases and 2850 controls in the Chinese cohort and 707 ALS cases and 971 normal controls in the Japanese replication cohort using METAL (version 2011-03-25)46. A value of p < 1.72 × 10−3 (= 0.05/29; Bonferroni adjusted) in the replication stage was considered statistically significant.
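
The replication threshold is a Bonferroni adjustment for the 29 tested SNPs. A minimal sketch of the arithmetic, using p-values quoted in the Results, is given below; the function name `bonferroni_threshold` is hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# 29 SNPs were tested in the replication stage.
REPLICATION_THRESHOLD = bonferroni_threshold(0.05, 29)  # about 1.72e-3

# p-values reported in the Results section.
p_combined_rs11195948 = 1.82e-4    # Chinese + Japanese replication
p_japanese_replication = 0.554     # Japanese replication cohort alone

combined_significant = p_combined_rs11195948 < REPLICATION_THRESHOLD
japanese_significant = p_japanese_replication < REPLICATION_THRESHOLD
```

Under this threshold the combined replication result for rs11195948 is significant, whereas the Japanese replication cohort on its own is not.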

Multi-ethnic meta-analysis

Inverse-variance meta-analysis was conducted among 1234 cases and 2850 controls in the Chinese cohort7 and the former samples in European6 and Japanese datasets (JaCALS) using METAL (version 2011-03-25)46.

Gene-based analysis of the multi-ethnic dataset

For the summary statistics of the multi-ethnic meta-analysis of Japanese, European, and Chinese datasets with 3,704,464 SNPs, a gene-based analysis was applied using MAGMA (version 1.07) with the default options49. In the pre-processing step, each SNP was mapped to genes. If a SNP was located within the gene body region of a gene, the SNP was annotated to this gene; a SNP in an intergenic region was not annotated to any gene. In this step, an annotation window can be set to include the peripheral regions around the genes. The default value of the annotation window is zero in MAGMA (version 1.07). This value is the strictest option in the pre-processing step, since extra regions around the gene are not included in gene-based analyses. Thus, we used the default option. After the pre-processing, 1,585,558 SNPs in total were annotated to 17,544 genes. In the gene-based analysis, a p-value is calculated for each gene from the SNPs annotated to this gene in the pre-processing step. We used the default SNP-wise mean model in MAGMA for the calculation step. The SNP-wise mean model (the mean of the χ2 statistic for the SNPs in a gene) is highly similar to the commonly used SKAT model (with inverse variance weights)50. The drawback of this method is that it decreases the power to detect associations for rare variants. In our analysis, the minor allele frequency in the discovery Japanese cohort was at least 0.03, so this problem was negligible. Finally, 13 genes that passed the genome-wide significance threshold of p < 2.85 × 10−6 (=0.05/17544; after Bonferroni correction) were selected.
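
The annotation rule and the gene-level threshold described above can be sketched as follows. This is not MAGMA code; the function name `annotate`, the gene name, the coordinates, and the SNP identifiers are all hypothetical placeholders, and only the zero default window and the count of 17,544 genes come from the text.

```python
from typing import Dict, List, Tuple

Gene = Tuple[str, str, int, int]  # (name, chromosome, start, end)


def annotate(snps: Dict[str, Tuple[str, int]], genes: List[Gene],
             window: int = 0) -> Dict[str, List[str]]:
    """Assign each SNP to every gene whose body (extended by `window`
    base pairs on both sides) contains the SNP position.

    With window = 0, intergenic SNPs are not annotated to any gene.
    """
    annotated: Dict[str, List[str]] = {name: [] for name, _, _, _ in genes}
    for snp_id, (chrom, pos) in snps.items():
        for name, g_chrom, start, end in genes:
            if chrom == g_chrom and start - window <= pos <= end + window:
                annotated[name].append(snp_id)
    return annotated


# Bonferroni threshold for 17,544 annotated genes (about 2.85e-6).
GENE_THRESHOLD = 0.05 / 17544

# Hypothetical example: one gene and three SNPs.
example_genes = [("GENE_A", "10", 100, 200)]
example_snps = {"snp1": ("10", 150), "snp2": ("10", 250), "snp3": ("11", 150)}
strict = annotate(example_snps, example_genes)             # window = 0
relaxed = annotate(example_snps, example_genes, window=50)
```

In this toy example only `snp1` falls within the gene body, so the strict (zero-window) annotation keeps one SNP, while a 50-bp window would also pick up `snp2`.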

Quantitative real-time PCR

Lymphoblastoid B cell lines (LCLs) were prepared from peripheral blood B cells of ALS patients using standard Epstein-Barr virus transformation techniques at the time of registration for JaCALS51. LCLs obtained from 20 age-matched and sex-matched patients with ALS (Supplementary Table 1) for each genotype of rs3736947 were used for the gene expression analysis. All 60 patients selected for the eQTL quantification were diagnosed with sporadic ALS and underwent exome sequencing or target resequencing. The details of the methods have been previously described2. These patients had no pathogenic variant in ALS-related genes such as SOD1, FUS, and TARDBP. Total RNA was extracted from the LCLs using the PureLink RNA Mini kit (Thermo Fisher Scientific, Waltham, MA, USA). Total RNA was reverse-transcribed using the SuperScript IV VILO Master Mix (Thermo Fisher Scientific). Real-time PCR was performed using the THUNDERBIRD SYBR qPCR Mix (Toyobo, Osaka, Japan) and the CFX96 system (BioRad, Hercules, CA, USA), according to the manufacturer’s instructions. The expression level of the internal control, β2 microglobulin, was simultaneously quantified. The primers are listed in Supplementary Table 2. Differences in ACSL5 mRNA expression among the three genotype groups were evaluated by the Kruskal–Wallis test followed by the Steel–Dwass test. Statistical analyses were conducted using the JMP 15.1 program (SAS Institute Inc., Cary, NC, USA).
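
For readers unfamiliar with the omnibus test used here, the following sketch computes the Kruskal–Wallis H statistic for three groups. The analysis in this study was run in JMP; this code is only an illustration, the function name `kruskal_wallis_3` is hypothetical, the Steel–Dwass post hoc comparison is not implemented, and the p-value relies on the chi-square approximation (for 2 degrees of freedom the upper tail is exp(−H/2)).

```python
import math
from typing import Sequence, Tuple


def kruskal_wallis_3(groups: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Kruskal-Wallis H test for exactly three groups.

    Uses mid-ranks for ties with the standard tie correction and the
    chi-square approximation with 2 degrees of freedom for the p-value.
    Returns (H, p).
    """
    if len(groups) != 3:
        raise ValueError("this sketch handles exactly three groups")
    pooled = sorted((v, gi) for gi, g in enumerate(groups) for v in g)
    n = len(pooled)
    rank_sums = [0.0, 0.0, 0.0]
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1..j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = (12.0 / (n * (n + 1))
         * sum(r * r / len(g) for r, g in zip(rank_sums, groups))
         - 3.0 * (n + 1))
    h /= 1.0 - tie_term / (n ** 3 - n)  # tie correction
    p = math.exp(-h / 2.0)  # chi-square upper tail, 2 degrees of freedom
    return h, p
```

As a toy example with made-up values, three fully separated groups `[1, 2, 3]`, `[4, 5, 6]`, and `[7, 8, 9]` give H = 7.2.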

Statistics and reproducibility

The software and databases used for the data analysis of this study are as follows: PLINK version 1.90b5.1 (https://www.cog-genomics.org/plink2), IMPUTE4 (https://jmarchini.org/impute-4/), LocusZoom version 1.4 (https://genome.sph.umich.edu/wiki/LocusZoom_Standalone), METAL version 2011-03-25 (http://csg.sph.umich.edu/abecasis/metal/index.html), MAGMA version 1.07 (https://ctg.cncr.nl/software/magma), 2KJPN (https://ijgvd.megabank.tohoku.ac.jp/), BioBank Japan Project (https://biobankjp.org/english/index.html). Statistical analyses for mRNA expression analyses were conducted using the JMP 15.1 program (SAS Institute Inc., Cary, NC, USA).

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.