Genetic analysis of ALS cases in the isolated island population of Malta

Genetic isolates are compelling tools for mapping genes of inherited disorders. The archipelago of Malta, a sovereign microstate in the south of Europe, is home to a geographically and culturally isolated population. Here, we investigate the epidemiology and genetic profile of Maltese patients with amyotrophic lateral sclerosis (ALS), identified throughout a 2-year window. Cases were largely male (66.7%) with a predominant spinal onset of symptoms (70.8%). Disease onset occurred around mid-age (median age: 64 years in men; 59.5 years in women); 12.5% had familial ALS (fALS). Annual incidence rate was 2.48 (95% CI 1.59–3.68) per 100,000 person-years. Male-to-female incidence ratio was 1.93:1. Prevalence was 3.44 (95% CI 2.01–5.52) cases per 100,000 inhabitants on 31st December 2018. Whole-genome sequencing allowed us to determine rare DNA variants that change the protein-coding sequence of ALS-associated genes. Interestingly, the Maltese ALS patient cohort was found to be negative for deleterious variants in C9orf72, SOD1, TARDBP or FUS genes, which are the most commonly mutated ALS genes globally. Nonetheless, ALS-associated repeat expansions were identified in ATXN2 and NIPA1. Variants predicted to be damaging were also detected in ALS2, DAO, DCTN1, ERBB4, SETX, SCFD1 and SPG11. A total of 40% of patients with sporadic ALS had a rare and deleterious variant or repeat expansion in an ALS-associated gene, whilst the genetic cause of two-thirds of fALS cases could not be pinpointed to known ALS genes or risk loci. This warrants further studies to elucidate novel genes that cause ALS in this unique population isolate.


Introduction
Amyotrophic lateral sclerosis (ALS) is an adult-onset, rapidly progressing, neurodegenerative disease. Onset is typically accompanied by clinical signs of upper and/or lower motor neuron degeneration and patients usually present with weakness in the bulbar muscles, in the limbs only, or in both regions simultaneously. About 15-20% of persons with ALS experience progressive cognitive decline, leading ultimately to dementia. The condition known as frontotemporal dementia results from the degeneration of the frontal and temporal lobes [1,2]. The incidence of ALS in European populations is two to three cases per year per 100,000, whereas prevalence can reach 10/100,000 [3]. ALS is classified as familial (fALS) in the presence of a clear family history of the disease and sporadic (sALS) when this is absent. The rate of fALS among prospective population-based registries is about 5% [4]. Nonetheless, studies on twins and European case-control cohorts have found that heritability of ALS approximates 40% [5][6][7]. To date, variants in any of more than 40 genes have been reported to cause monogenic fALS, with more than half of the cases explained by highly penetrant causal variants residing in C9orf72 (23%), SOD1 (19%), TARDBP (3%) or FUS (3%) [8]. Variants in these genes [8][9][10] and genetic risk loci that include ATXN2 [11], UNC13A [12], SARM1 [13], C21orf2, SCFD1 and MOBP [13] have also been identified in sALS cases. Genetic variants are estimated to contribute to between 14 and 17% of sALS cases in European patients or those with a European ancestry [9,10]. All this underscores the substantial contribution of genetic factors to ALS disease aetiology.
Genetic discoveries often lead to novel insights into the molecular mechanisms of ALS. For instance, clustering genes according to their known physiological functions identified ribostasis, proteostasis and cytoskeletal dynamics as key cellular pathways involved in ALS pathogenesis [14]. Importantly, discovery of genes is leading to the development of genotype-specific treatments [15]. Additional genetic factors remain to be found for ALS. The use of geographically and/or culturally isolated populations for mapping novel ALS genes is a strategy that remains relatively unexploited despite benefits that include founder effects, reduced genetic diversity and minimal environmental heterogeneity [16]. This spurred us to study the native population of Malta, a sovereign microstate in the middle of the Mediterranean Sea. Consisting of an archipelago of three inhabited islands (total area 316 km²), Malta's population presently numbers about 514,564, based on Malta's National Statistics Office (NSO) data in 2019. Population seeding events occurred more than 7000 years ago by settlers coming from neighbouring Sicily and, based on mitochondrial DNA, Y-chromosome and autosomal DNA marker analyses, influences by colonisers in the centuries that followed were minimal [17][18][19]. The Maltese are the only European population that speak a Semitic language, further underscoring their relative isolation from other communities inhabiting Europe. Previous successes in gene mapping of rare diseases in the population of Malta are encouraging [20,21]. Here, we first investigate the incidence and prevalence of ALS in the Maltese islands over a 2-year period. Second, we perform an initial genetic survey by reporting on rare variants in protein-coding regions and consensus splice sites of all presently known monogenic ALS genes and genetic risk loci in Maltese ALS patients compared to matched controls.

Participants
Participant surveillance and recruitment occurred throughout a 2-year window, from 2017 through 2018. Patients diagnosed with probable or definite ALS, referred by the national Motor Neuron Disease association, consultant neurologists, general practitioners or neurophysiology units, were invited to participate in our study. Alternatively, patients or their relatives made direct contact with our laboratory expressing willingness to participate in the study. Patient participants met the revised El Escorial criteria for ALS [22,23]. Patients with fALS were identified as having a self-reported family history of ALS, or probable ALS, defined as the presence of at least one affected first-degree relative. In total, 24 patients were enrolled in this study. Blood sampling was excluded for one sALS case in view of the patient's deteriorating condition. Affected family members for all fALS cases were deceased, precluding us from sampling them. Controls, which were ascertained in a roughly 2:1 case-control ratio, matched patients for age, sex and geographical region. Ethical approval for the collection of samples, study design and the creation of the Malta ALS/MND Register and Biobank was given by the Research Ethics Committee of the University of Malta. Written informed consent to participate was sought from all patients and/or family members as well as controls.

Phenotypic information
Phenotypic information was gathered via a detailed questionnaire in addition to a clinical examination. Each sample is therefore accompanied by a core dataset that includes age, sex, occupation, site of onset, date of disease onset, family history, ALS Functional Rating Scale-revised (ALSFRS-R) score, muscle tone for both upper and lower limbs, muscle power graded according to the MRC scale, and status of reflexes. Information on possible environmental risk factors including physical activity, cigarette smoking and alcohol consumption was also gathered. A biochemical assay to test creatine kinase (CK) levels at recruitment was also utilised.

Incidence and prevalence calculations
The denominator for the calculation of the incidence rate was the sum of the total population of Malta in 2017 and 2018. During the study period, the population of Malta increased from 475,701 to 493,559. Separate incidence rates for males and females as well as specific age groups were also calculated. Population numbers were derived from NSO data. The prevalence rate was estimated on 31st December 2018. Confidence limits for incidence were calculated assuming a Poisson distribution.
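The incidence calculation described above can be sketched in a few lines of Python. This is an illustration under two assumptions that the text does not state explicitly: that the 24 enrolled patients are the incident cases, and that "assuming a Poisson distribution" refers to the exact (Garwood-type) interval. All function names are hypothetical.

```python
import math


def poisson_cdf(k: int, mu: float) -> float:
    """Return P(X <= k) for X ~ Poisson(mu)."""
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def exact_poisson_ci(k: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided confidence interval for a Poisson mean given k events."""

    def solve(target: float, kk: int) -> float:
        # poisson_cdf(kk, mu) decreases monotonically in mu, so bisect on mu.
        lo, hi = 0.0, max(10.0, 5.0 * (k + 1))
        for _ in range(200):
            mid = (lo + hi) / 2
            if poisson_cdf(kk, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else solve(1 - alpha / 2, k - 1)
    upper = solve(alpha / 2, k)
    return lower, upper


def rate_per_100k(cases: int, person_years: float) -> tuple[float, float, float]:
    """Rate and exact 95% CI, all expressed per 100,000 person-years."""
    lower, upper = exact_poisson_ci(cases)
    scale = 100_000 / person_years
    return cases * scale, lower * scale, upper * scale


# Denominator: 2017 plus 2018 population, as described in the text.
person_years = 475_701 + 493_559
# ASSUMPTION: 24 incident cases (the enrolled cohort size).
rate, ci_low, ci_high = rate_per_100k(24, person_years)
print(f"{rate:.2f} ({ci_low:.2f}-{ci_high:.2f}) per 100,000 person-years")
```

Under these assumptions the rate and interval come out close to the values reported in the Results, which suggests, but does not prove, that an exact Poisson interval was used.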

Whole-genome sequencing
DNA was extracted from EDTA-containing whole venous blood samples using the QIAamp DNA Mini QIAcube Kit and DNA integrity was measured using the Quantus fluorometer. DNA was whole-genome sequenced on the BGISEQ-500 platform (BGI, Hong Kong, China) to generate 100 bp paired-end reads with an average depth of 30×. Reads were aligned to the GRCh37 (hg19) reference genome using Burrows-Wheeler Aligner software. Single nucleotide variant (SNV) and small insertion and deletion (indel) calling and quality filtering were performed using the Genome Analysis Toolkit (GATK). The ExpansionHunter tool was used to analyse repeat sizes of ATXN2 (NM_002973.3:c.496_498CAG), C9orf72 (NM_001256054.2:c.-45+163GGGGCC) and NIPA1 (NM_144599.4:c.24_26GGC) [24]. In order to estimate the genetic ancestry in relation to reference maps of diverse populations, principal-component analysis (PCA) was performed on LASER with results plotted using the LASER Server plot facility (https://laser.sph.umich.edu/). HomozygosityMapper (www.homozygositymapper.org) [25] was used to map runs of homozygosity (ROHs).

Variant analysis
We searched for and analysed protein-coding and splice-site altering variants and indels in 58 established ALS causative or risk genes (Table 1). We restricted analyses to variants with European minor allele frequency (MAF) ≤0.01, which corresponds roughly to the European frequency of the recently discovered ALS risk NM_004984.2:c.2957C>T; p.(Pro986Leu) allele in the KIF5A gene [26]. Where available, variants were then annotated with information from the dbSNP database including European-specific MAF estimates from the Genome Aggregation Database (gnomAD). Allele frequencies for ALS cases and controls within the Project MinE dataset were extracted from the Project MinE databrowser [27]. To determine variant pathogenicity, MetaSVM and MetaLR, two ensemble-based prediction methods integrating multiple scoring systems, were used in view of their superior predictive ability relative to other methods [28]. Variants were considered as damaging if outcomes of both methods concurred. Indels and splice-site acceptor/donor variants were automatically classified as deleterious. Variants and the associated phenotypes have been submitted to the ClinVar database (https://www.ncbi.nlm.nih.gov/clinvar/) with accession numbers SCV001426191, SCV001426206-SCV001426210, SCV001426221-SCV001426223 and SCV001437161-SCV001437192. New variants detected in the Maltese case-control cohort were submitted to the dbSNP database (https://www.ncbi.nlm.nih.gov/snp/) with the submission SNP IDs of ss2137544106, ss3986090479, ss3986090480, and ss3986090481.
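The filtering rules in this section can be summarised as a short Python sketch. It is not the authors' pipeline: the record fields (`eur_maf`, `consequence`, `metasvm`, `metalr`, `in_cases`, `in_controls`), the consequence labels and the use of "D" for a damaging call are all hypothetical choices made for illustration, as is the decision to keep variants that have no frequency record.

```python
from typing import Optional

MAF_THRESHOLD = 0.01  # European MAF cut-off stated in the text

# ASSUMPTION: hypothetical consequence labels, not tied to any annotation tool.
AUTO_DELETERIOUS = {"indel", "splice_acceptor", "splice_donor"}


def is_rare(eur_maf: Optional[float]) -> bool:
    """Rare if European MAF <= 0.01; variants with no frequency record are kept (assumption)."""
    return eur_maf is None or eur_maf <= MAF_THRESHOLD


def is_deleterious(consequence: str, metasvm: Optional[str], metalr: Optional[str]) -> bool:
    """Indels and splice-site variants count as deleterious; SNVs need both predictors to agree."""
    if consequence in AUTO_DELETERIOUS:
        return True
    # ASSUMPTION: "D" marks a damaging call from each predictor.
    return metasvm == "D" and metalr == "D"


def candidate_variants(variants: list[dict]) -> list[dict]:
    """Keep rare, deleterious variants seen in patients but not in controls."""
    return [
        v for v in variants
        if is_rare(v["eur_maf"])
        and is_deleterious(v["consequence"], v["metasvm"], v["metalr"])
        and v["in_cases"]
        and not v["in_controls"]
    ]


def make(vid, maf, cons, svm, lr, cases, controls):
    """Build one toy record with the hypothetical fields used above."""
    return {"id": vid, "eur_maf": maf, "consequence": cons,
            "metasvm": svm, "metalr": lr,
            "in_cases": cases, "in_controls": controls}


# Toy records for illustration only; these are not variants from the study.
toy = [
    make("v1", 0.002, "missense", "D", "D", True, False),      # rare, both damaging
    make("v2", 0.002, "missense", "D", "T", True, False),      # predictors disagree
    make("v3", 0.050, "missense", "D", "D", True, False),      # too common
    make("v4", None, "indel", None, None, True, False),        # indel, no frequency record
    make("v5", 0.001, "splice_donor", None, None, True, True), # also seen in controls
]
print([v["id"] for v in candidate_variants(toy)])
```

Requiring agreement between the two predictors trades sensitivity for specificity, which fits the conservative framing of the analysis.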

Statistics
Comparisons between means were made with the unpaired, two-tailed Student t test, whereas comparisons between categorical variables were made with the χ² test. A p value <0.05 was considered significant. Data were processed with GraphPad Prism v8.4.0 software.
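For readers unfamiliar with the unpaired Student t test, the statistic can be sketched as follows. This is a generic, pooled-variance illustration using only the Python standard library; the function name and the toy values are hypothetical and do not come from the study, whose analyses were run in GraphPad Prism.

```python
import math
import statistics


def student_t(sample_a: list[float], sample_b: list[float]) -> tuple[float, int]:
    """Pooled-variance (Student) t statistic and degrees of freedom for two unpaired samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    # Pool the two variances, weighting each by its degrees of freedom.
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / se
    return t, df


# Toy values for illustration only; not data from the study.
t, df = student_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(f"t = {t:.3f}, df = {df}")
```

The two-tailed p value then follows from the t distribution with `df` degrees of freedom and is compared against the 0.05 threshold stated above.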

Baseline characteristics
The key characteristics of patients and controls are detailed in Table 2. ALS cases were largely male with a predominant spinal onset of symptoms. Bulbar onset ALS had a slightly higher occurrence in females (57.2%). Disease onset occurred around mid-age with median age at onset being lower in females (59.5 years) compared to males (64 years). Only one male and one female patient were under 45 years old (8.3%). Early ALS (≤55 years) was more frequently spinal at onset (spinal/bulbar ratio = 9:1). A family history of ALS was recorded for a minority of cases (Table 2). However, considering their familial aggregation with ALS [29][30][31], the inclusion of both a family history of dementia and neuropsychiatric endophenotypes, like schizophrenia or psychosis, increases the proportion of fALS to 37.5% (9/24 patients). The mean duration of the illness was 44.5 ± 28.3 SD months, with males having a faster progression compared to females (29.6 ± 8.8 SD months vs. 66.8 ± 33.4 SD months, p = 0.0065). Site of disease onset did not influence disease duration (spinal onset = 43.5 ± 32.4 SD months, bulbar onset = 43.7 ± 19.8 SD months, p = NS).
One-third of the ALS patients recruited had a history of heavy smoking and more than half reported an occupation associated with strenuous activity, both of which have been implicated as environmental risk factors for ALS [32,33]. Alcohol abuse in the patient cohort was minimal. Given that a number of years had elapsed between onset and recruitment, the levels of CK in the patient cohort were on average only slightly above the normal range. The mean ALSFRS-R score in ALS recruits was nearly half that recorded for the control subjects, with the score in the latter nearing 48, the maximum expected in healthy individuals. The distribution of cases and controls throughout the Maltese islands is displayed in Fig. 1. A higher population density in the southeast of mainland Malta most probably explains the increase in the number of cases in this geographic region relative to other regions.

Incidence and prevalence rates
The annual incidence rate for ALS in the 2017-2018 period was 2.48/100,000 person-years (95% CI 1.59-3.68). The male-to-female incidence ratio was 1.93:1. Hence, incidence rate was higher for men (3.25, 95% CI 1.86-5.28) than for women (1.68, 95% CI 0.72-3.31), and this trend occurred across all age groups after 49 years (Fig. 2). For both men and women, the incidence increased with age but declined after age 79. Peaks occurred in the 50 to 59 age group among men and in the 70 to 79 age group among women. A total of 17 patients (male = 12, female = 5) were [...] (Fig. 3). A genetic affinity with Middle East populations is also apparent in Sicilians [34,35], further supporting a common genetic ancestry for populations inhabiting the Mediterranean islands of Malta and Sicily.

Repeat expansions in C9orf72, ATXN2 and NIPA1 genes
Despite studies which show that C9orf72 is the major gene that is mutated in the European ALS population [8,36,37], we did not identify pathogenic hexanucleotide (GGGGCC) repeat expansions (≥24) in C9orf72 in either fALS or sALS cases (Fig. 4). The expansion repeat size ranged from 2 to 10 in controls and from 2 to 13 in ALS patients. Consistent with previous studies [36,37], a repeat length of 2 was the most predominant in either group (42.3% in controls and 52.2% in ALS cases). In addition to C9orf72, repeat expansions in other genes including ATXN2 and NIPA1 have been associated with increased risk of ALS [11,38]. We identified one male patient with fALS that possessed ATXN2 ALS-associated trinucleotide repeat expansions (28 repeats in length) in the homozygous state (Fig. 4). At the age of 67 years, this patient first experienced bilateral leg weakness that progressed. He subsequently developed dysarthria and succumbed to the disease within 2 years of disease onset. This patient was also the only one in our cohort that showed signs of cognitive impairment. Family history was notable for a deceased sister who had ALS with early onset in her late 30s, and a deceased mother who had dementia (DNA samples were not available for study). The pedigree is shown in Supplementary Fig. S1. The maximum ATXN2 repeat size observed in healthy controls was 25, encountered in the heterozygous state, in one subject. All remaining ALS patients and controls had repeat lengths ≤23. As was reported previously [11,39]

Fig. 2 Change in incidence rate of ALS with age in Malta. Incidence rate increases with age and was higher for males compared to females across all age groups after 49 years. Peaks occur in the 50-59 age group for men and in the 70-79 age group for women.

Genetic variants in known ALS-associated genes
Interestingly, the Maltese ALS patient cohort was found to be negative for non-synonymous or splice-site altering SNVs in the SOD1, TARDBP or FUS genes, which are the most commonly mutated ALS genes, in that order, following C9orf72 [8]. After examining 58 ALS-associated genes in our patient and control cohort, we identified 35 rare (European MAF ≤ 0.01) coding variants that were present in Maltese ALS patients and absent in controls (Table 3). Three SNVs in DDX20, EWSR1 or GLE1 were not found in the dbSNP (v141) database. The NM_001003722.1:c.2078C>T; p.(Ser693Phe) variant in GLE1 was, however, reported in a recent study as a founder mutation in the Maltese population that, in the homozygous state, causes a motor dysfunction syndrome presenting in childhood [21]. Variants predicted to be damaging by both MetaSVM and MetaLR were detected in ALS2, DAO, DCTN1, ERBB4, SCFD1 and SPG11 (Table 3). All were detected in patients with sALS. The NM_001917.4:c.250G>A; p.(Ala84Thr) variant in DAO was detected in two patients, whereas one female patient possessed deleterious variants in more than one gene (DAO and DCTN1).
Analysis of indels allowed us to identify a rare deletion in SETX, detected in the heterozygous state in a sALS case (Table 3). This patient presented with upper limb weakness at the age of 70. One year later, on follow-up, weakness had spread to the lower limbs. The deletion is predicted to result in a frameshift, consequently producing a truncated SETX protein lacking the helicase domain. It is noteworthy that the damaging ALS2, SCFD1, and SETX variants detected in Maltese sALS patients were found to have higher allele frequencies in ALS patients within the Project MinE case-control dataset [27], thereby underscoring their probable pathogenicity (Table 3). Rare SNVs or indels that were unique to controls or that were shared by ALS patients and controls are listed in Supplementary Table S1. One SNV in FIG4 was not found in the dbSNP (v141) database.
No homozygous stretches were overrepresented in sALS patients compared to controls (Supplementary Fig. S2). To estimate the genetic risk for ALS in the Maltese population, we determined the percentage of sALS cases carrying a rare and potentially deleterious variant (absent in controls) in at least one ALS-associated gene, which was 40% (8/20 patients). Two fALS cases did not carry any mutations in known ALS genes or risk loci, which warrants further studies to elucidate novel genes that cause ALS.

Discussion
In our work, we investigated the characteristics of Maltese ALS patients, described their genetic profile, and determined the incidence and prevalence of ALS on the Maltese islands. It is interesting that the population-specific aspects of our ALS cases overlap those reported for other neighbouring populations, especially those in the Mediterranean including the island of Sicily [41] and the southern region of Puglia [42,43] in Italy, Tunisia [44] and Cyprus [45]. Although the male preponderance in ALS is virtually universal [46,47], it is noteworthy that in these specific populations as well as in Malta, age at disease onset is higher in males than in females. This is in contrast to northern populations of the Mediterranean basin including Catalonia in Spain [48], and the regions of Emilia Romagna [49], Liguria [50] or Friuli-Venezia-Giulia [51] in Italy. Incidence and prevalence of ALS in Malta are similar to the European median [3].
We report a higher percentage (12.5%) of fALS cases in Malta than the roughly 5% seen in prospective population-based registries [4], close to that reported for the northern Italian population of Liguria (10%) [50] but nearly half that reported for the island of Sardinia (26.7%) [52]. Nonetheless, similar to previous studies [53], relaxing the stringent criteria by including neurological conditions in kindreds that have a genetic overlap with ALS [54] can increase the fALS percentage in Malta threefold (37.5%). In agreement, considering the Maltese sALS subset, we showed that, compared to other European populations [10] or populations of European ancestry [9], a higher proportion of seemingly sALS patients are probably genetically determined.
Intriguingly, rare deleterious variants in the major ALS genes, including C9orf72, SOD1, TARDBP and FUS, were absent in Maltese ALS patients. This finding confirms the presence of a North-South gradient in the frequency of mutations within these genes across Europe. Hence, C9orf72 or SOD1 mutations in fALS are highest in northern European countries like Belgium and Finland, whereas a relatively low frequency is recorded in the south of Europe including Spain and mainland Italy [8]. A similar situation can be observed for TARDBP and FUS [8]. Our study thus underscores the marked differences that exist between ethnic groups and geographical regions with respect to the genes that are commonly implicated in ALS.
Maltese ALS patients nevertheless possessed deleterious alleles in 'minor' ALS genes including ALS2, ATXN2, DAO, DCTN1, ERBB4, NIPA1, SETX, SCFD1 and SPG11. ALS2 and SPG11 have been associated with juvenile-onset ALS only under a recessive disease model [55][56][57][58]. In this context, since we observed variants in the ALS2 and SPG11 genes solely in heterozygous configurations and in patients who had adult-onset ALS, it is likely that these alleles were not disease causing in the patients that possessed them. However, considering that ALS has an oligogenic basis [59], a modifying or additive effect cannot be excluded. The same can be said for SCFD1, which has only recently been identified as a risk locus [13], and for which we report a damaging variant in an ALS patient with a young age of onset (27 years) and whose disease progression is exceptionally slow.
ATXN2, DAO, DCTN1, ERBB4, NIPA1 and SETX have all been previously associated with ALS having an autosomal dominant mode of inheritance [11,38,40,[60][61][62][63]. Damaging alleles discovered in our patient cohort, which specifically target these genes, are most probably causative. It is interesting to note that the ERBB4 c.3814G>A; p.(Gly1272Arg) variant reported in this study is extremely close to the one reported in a Japanese sALS individual [c.3823C>T; p.(Arg1275Trp)], both of which are located in the C-terminal domain of the protein, close to multiple phosphorylation sites, which mediate downstream signalling pathways [61]. Although SETX variants were initially discovered in juvenile-onset ALS patients [62], reports have since described damaging alleles in patients with adult-onset ALS [10,64]. This is in line with our study: the Maltese ALS patient possessing a SETX deletion had a late age of onset similar to the one reported in a previous case study [64].
Our findings have important implications. Incidence and prevalence of ALS in Malta as well as patient population aspects overlap those of neighbouring countries. However, supported by genetic ancestry results, the genetic architecture of ALS in Malta appears to be different from the European average, underscoring genetic isolation imposed by geography. This, combined with the lack of an identified genetic factor in two-thirds of Maltese fALS cases, encourages further studies aimed at discovering novel ALS genes. Our 'preliminary' data exclude the possibility that these patients have deleterious variants in a set of genes associated with other motor neuron disorders including hereditary ataxias, and hereditary motor and sensory neuropathies (data not shown). Finally, variants described in this work should spur the generation of animal models to confirm causation and better understand disease mechanisms [65]. This is imperative especially for 'minor' ALS genes, given that they are less studied than 'major' genes yet are nonetheless consequential in specific populations.

Acknowledgements
The authors are grateful to Prof. Richard Muscat, Wilfred Kenely and Dr. Graziella Zahra for their support of this work. Thanks also go to Matthew Camilleri for unwavering technical and administrative support. We remain indebted to all participants of this study.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.