By performing exome sequencing in 14 autosomal dominant early-onset Alzheimer disease (ADEOAD) index cases without mutations in known genes (amyloid precursor protein (APP), presenilin1 (PSEN1) and presenilin2 (PSEN2)), we found that in five patients, the SORL1 gene harbored unknown nonsense (n=1) or missense (n=4) mutations. These mutations were not retrieved in 1500 controls of the same ethnic origin. In a replication sample, including 15 ADEOAD cases, 2 unknown non-synonymous mutations (1 missense, 1 nonsense) were retrieved, thus yielding a total of 7/29 unknown mutations in the combined sample. Using in silico predictions, we conclude that these seven private mutations are likely to have a pathogenic effect. SORL1 encodes the Sortilin-related receptor LR11/SorLA, a protein involved in the control of amyloid beta peptide production. Our results suggest that besides the involvement of the APP and PSEN genes, further genetic heterogeneity, involving another gene of the same pathway, is present in ADEOAD.
Autosomal dominant early-onset Alzheimer disease (ADEOAD) is a rare condition, for which prevalence is estimated at 5.3 per 100 000 persons at risk.1 Thanks to a nationwide recruitment, we have screened 130 ADEOAD families for mutations in known genes and we had previously detected missense mutations in the amyloid precursor protein (APP), presenilin1 (PSEN1) and presenilin2 (PSEN2) genes in 14, 82 and 7 families, respectively. In negative families, a search for copy number variants had allowed us to describe duplications of the APP gene2 in 11 families and, more recently, to identify 2 rare copy number variants targeting genes involved in amyloid beta (Aβ) peptide processing or signaling.3 After this screening, genetic determinism still remained unexplained in 14 families. Unfortunately, lack of DNA for affected relatives precluded a linkage analysis in these kindreds. We thus decided to study these families by exome sequencing, searching for gene(s) recurrently hit by unknown mutations.
Subjects and methods
A total of 14 unrelated index cases from families consistent with autosomal dominant inheritance were included in the study. In all, 13 patients fulfilled the NINCDS-ADRDA (the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer's Disease and Related Disorders Association) criteria for probable Alzheimer's disease (AD) and one for definite AD. Cerebrospinal fluid biomarkers were available for nine patients, and in all cases, supported the diagnosis. No specific clinical or imaging phenotypes were noted. These patients belonged to families with at least two AD cases with onset <65 years in at least two generations (except in one family, in which several siblings were affected with a strong censoring effect in two previous generations). In these subjects, mutations within PSEN1, PSEN2 and APP (exons 16 and 17) had previously been excluded. These patients had also been negatively screened for copy number variants by using human high-resolution comparative genomic hybridization array Kit 1 × 1 M (Agilent Technologies, Santa Clara, California, USA).
The replication sample included 15 unrelated patients from EOAD families originating from France, Italy, Portugal, UK and Algeria, ascertained according to the same criteria as the initial sample, except that information on relatives' clinical status was less precisely recorded, and therefore, a NINCDS-ADRDA diagnosis of possible AD was allowed in relatives. Mutations within PSEN1, PSEN2 and APP (exons 16 and 17) had previously been excluded for all patients.
A sample of 1500 controls (mean age: 73.8±5.4 years, 39% of whom were males) was selected from the prospective 3 cities study according to the following criteria at 7 years of follow-up: (i) free of dementia, (ii) mini–mental state examination score 27 and (iii) no mini–mental state examination score variation >1 between inclusion and follow-up. Briefly, the 3 cities study4 is a population-based, prospective study of the relationship between vascular factors and dementia that has been carried out in three French cities: Bordeaux (southwest France), Montpellier (south France) and Dijon (central eastern France). The initial sample included 9686 non-institutionalized subjects of age >65 years, randomly selected from the electoral rolls of each city between January 1999 and March 2001.
The Exome Sequencing Project consortium data set (snp.gs.washington.edu/EVS/; November 2011 release) includes exome data of 5379 American individuals of African or Caucasian origin, unchecked for cognitive status.
Genomic DNA was extracted from whole blood and sheared by sonication to obtain an average fragment size of 150–200 bp. DNA (3 μg) from each individual was used for the construction of a shotgun-sequencing library, using paired-end adaptors. Exome capture was performed using the SureSelect Human All Exon kits 38 Mb version 1 (Agilent) (n=12) or SureSelect Human All Exon kits 44 Mb version 2 (Agilent; n=2). Sequencing was performed on an Illumina Genome Analyser GAIIX (n=12) or an Illumina HiSeq 2000 (Illumina, Inc, San Diego, CA, USA) (n=2) following the manufacturer's instructions. Raw image files were processed by using the Illumina pipeline (CASAVA 1.7). The sequencing reads were aligned to the NCBI human reference genome (NCBI36.3 (n=12) or NCBI37 (n=2)), using ELANDv2. After variant calling, low-quality variations (QPhred score <10) were filtered out. This score was calculated by using two criteria based on sequencing depth and base-quality score. Filtration of unknown variations was performed using the Exome Variation Analyser, our in-house software (http://litis.insa-rouen.fr/EVA/index.php). Sanger sequencing of PCR amplicons from genomic DNA was used to confirm the presence of variants identified via exome sequencing and to screen these variants in relatives. Complete sequencing of the 48 exons of the SORL1 gene was performed in the replication sample on an ABI 3730 (Applied Biosystems, Courtaboeuf, France) automated sequencer using specific primers (Supplementary Table 4). Variant frequencies were determined in controls using the KASPar technology (Kbiosciences, Herts, England, UK).
Blood was collected into PAXgene Blood RNA tubes (Qiagen, Hilden, Germany). Total RNA was extracted from whole blood using the PAXgene Blood RNA kit (Qiagen). RNA was then reverse-transcribed using the Verso cDNA kit (Fisher Scientific, Illkirch, France) with a blend of random hexamers and anchored oligo-dT (3:1). The cDNA was PCR-amplified using primers flanking the mutation. PCR products were sequenced using the BigDye V3.1 Terminator Kit (Applied Biosystems) and an automated sequencer (ABI 3100; Applied Biosystems). Polymorphism rs12364988 and stop mutation c.4434C>A were used as markers to measure the allele-specific expression of the two alleles using the SNaPshot (PE Applied Biosystems, Foster City, CA, USA) technique.
Results and discussion
A total of 14 unrelated early-onset index cases from nuclear families consistent with autosomal dominant inheritance were included in the study. We generated an average of 4.5 Gb of sequence per affected individual as paired-end, 76 bp reads, with a mean coverage of 65-fold for the first 12 exomes and 80-fold for the two exomes of the second batch. At this depth of coverage, 86% of the targeted bases had >10 high-quality reads. On average, 15 600 or 20 028 exonic variants were identified per exome according to the capture protocol used (Supplementary Table 1). We focused on heterozygous non-synonymous, splice acceptor and donor site, and frameshift coding indel (NS/SS/I) variants and, to remove previously reported common variants, filtered them against dbSNP131. An average of 310 NS/SS/I variants (455 for the second batch) was retained per individual (Supplementary Table 2). These variants were then further filtered against the 1000 genomes project data set (May 2011, 20101123 release, www.1000genomes.org), Hapmap (www.hapmap.org), and our in-house database, including 72 exomes from unrelated individuals with non-neurodegenerative diseases sharing the same ethnic origin as our patients. At the completion of this analysis, genes harboring at least one unknown variant were classified according to their recurrence in the patient sample (Supplementary Table 3). Although most of these genes harbored unknown NS/SS/I variants in a single individual, some harbored such variants in several individuals. The 14 patients did not have in common a single altered gene, indicating that, within this sample, the disease was genetically heterogeneous. After validation of the variants by Sanger sequencing, the top-scoring gene was SORL1, harboring unknown mutations in 5/14 exomes (Table 1 and Supplementary Table 3).
SORL1 encodes the sortilin-related receptor LR11/SorLA, a 250 kDa transmembrane neuronal sorting protein that binds APP and directs its trafficking into the recycling endosome pathways, leading to production of the Aβ peptide. Underexpression of SORL1 results in an increase of Aβ production.5 SORL1 has already been the subject of numerous association studies focused on frequent polymorphisms (for a complete overview, see Alzgene, www.alzgene.org), yielding mixed results, although a recent meta-analysis, including more than 12 000 cases and 17 000 controls, concluded that multiple SORL1 alleles in distinct linkage disequilibrium blocks are associated with AD risk.6
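The recurrence-based filtration described above can be illustrated with a minimal sketch. It is not the Exome Variation Analyser itself: the function name, the data layout and the toy gene and variant identifiers are all hypothetical, and the set of "known" variants stands in for dbSNP131, the 1000 genomes data set, Hapmap and the in-house exomes.

```python
from collections import defaultdict


def rank_genes_by_recurrence(patient_variants, known_variants):
    """Rank genes by the number of patients carrying at least one unknown variant.

    patient_variants: dict mapping a patient id to a list of (gene, variant_id)
    known_variants:   set of variant ids already reported in reference databases
    """
    carriers = defaultdict(set)
    for patient, variants in patient_variants.items():
        for gene, variant_id in variants:
            # Keep only variants absent from every reference data set.
            if variant_id not in known_variants:
                carriers[gene].add(patient)
    # Most recurrent genes first; ties are broken alphabetically.
    return sorted(
        ((gene, len(patients)) for gene, patients in carriers.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Toy data (hypothetical identifiers, not taken from the study).
patient_variants = {
    "P1": [("GENE_A", "a1"), ("GENE_B", "b1")],
    "P2": [("GENE_A", "a2"), ("GENE_C", "c1")],
    "P3": [("GENE_A", "a1"), ("GENE_B", "b2")],
}
known_variants = {"b1", "c1"}

ranking = rank_genes_by_recurrence(patient_variants, known_variants)
for gene, n_patients in ranking:
    print(gene, n_patients)
```

Counting patients rather than variants per gene means that a gene hit by different private mutations in different families still scores highly, which is the situation reported here for SORL1.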
We also determined the allele frequency of each variant in a cohort of 1500 control individuals matched for ethnic origin. This work confirmed that all variants were indeed private variants not found in controls. Cosegregation analysis among affected relatives was possible in a single family and showed that the p.Gly511Arg variation identified in the index case was also found in the affected mother (Supplementary Figure 1a).
We next sequenced SORL1 in a replication sample including 15 index cases from additional ADEOAD families. We detected one missense and one nonsense mutation not present in dbSNP131, the 1000 genomes data set or our control exomes (Table 1 and Supplementary Figure 1b). This distribution did not significantly differ from that found in the initial sample (P=0.21, Fisher's exact test). In a third patient, we also detected an unknown silent mutation (g.121448030C>G), that according to the NNSplice software (www.fruitfly.org/seq_tools/splice.html), has no effect on splice sites and was not further considered.
In summary, unknown SORL1 variations were present in 7/29 families (Table 1). To rule out the possibility that the high frequency of unknown variations found in patients was driven by a nonspecific high mutation rate in this gene, we examined our 72 control exomes and the exomes of the Exome Sequencing Project data set for the frequency of SORL1 NS/SS/I coding variants not present in dbSNP131, Hapmap or the 1000 genomes data set. Compared with a frequency of 7/29 in the sample of EOAD patients, this frequency was 1/72 in our control exomes (P=0.0006, Fisher's exact test) and 132/5379 in exomes from the Exome Sequencing Project data set (P=6.2 × 10⁻⁶, Fisher's exact test), demonstrating that, despite its large size, the SORL1 gene is not particularly prone to harbor a burden of unknown mutations. We therefore conclude that the enrichment in unknown variations observed in our EOAD sample was disease-specific. Among the seven variations identified in EOAD patients, six were not retrieved in the 5379 individuals from the Exome Sequencing Project consortium for which exome data have recently been generated, and one (p.Asn924Ser) was detected in a single individual.
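For readers who wish to retrace the Fisher's exact tests quoted above, the following is a minimal sketch using only the Python standard library. It assumes a two-sided test and assumes that each reported fraction can be read as carriers over individuals in a 2 × 2 table; neither assumption is stated explicitly in the text, and the function and variable names are illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x carriers in the first row given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-7  # guards against floating-point ties
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )
    return min(1.0, total)


# Initial sample (5/14) versus replication sample (2/15).
p_replication = fisher_exact_two_sided(5, 9, 2, 13)
# Combined patients (7/29) versus in-house control exomes (1/72).
p_controls = fisher_exact_two_sided(7, 22, 1, 71)
# Combined patients (7/29) versus Exome Sequencing Project exomes (132/5379).
p_esp = fisher_exact_two_sided(7, 22, 132, 5379 - 132)

print(f"replication vs initial: P = {p_replication:.2f}")
print(f"patients vs control exomes: P = {p_controls:.4f}")
print(f"patients vs ESP exomes: P = {p_esp:.1e}")
```

Exact integer binomial coefficients are used so that the large Exome Sequencing Project table does not overflow; a statistics package would normally be preferred in practice.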
As peripheral blood cell RNA was available for individual EXT 050, we performed RT-PCR experiments to assess the functional consequence of the p.Cys1478X truncating mutation (Supplementary Figure 2). We observed a clear allelic imbalance of SORL1 expression, suggesting that the mutant allele was subjected to nonsense-mediated mRNA decay. The other identified variants were missense substitutions. To evaluate the predicted functional impact of each missense variation, we examined the evolutionary conservation of affected nucleotides using the Genomic Evolutionary Rate Profiling score, and the potential of the mutations to affect the structure and function of the protein using the Grantham score (Table 1). In silico analysis was conducted using MutationTaster (www.mutationtaster.org/),7 PolyPhen-2 (genetics.bwh.harvard.edu/pph2/) and SIFT (sift.jcvi.org/; Table 1). All missense variants were classified as probably damaging by at least one prediction program and three out of five were considered pathogenic by all three programs. SorLA is a multidomain type 1 transmembrane protein, which binds several ligands (Supplementary Figure 3). One mutation (p.Asn1358Ser) lies in the low-density lipoprotein receptor class A repeats that are essential for APP binding.8 Another mutation (p.Asn924Ser) is located within the β-propeller+epidermal growth factor domain. This mutation is located near two YWTD repeats that are essential for the correct folding of the propeller. The p.Tyr141Cys and p.Gly511Arg mutations are located within the VPS10p domain that binds several ligands, including the receptor-associated protein. Binding of the receptor-associated protein to SORL1 almost completely inhibits the APP/SorLA interaction.8 Specifically, the p.Gly511Arg mutation targets a highly conserved residue present in proteins containing a VPS10p domain.9 Finally, the p.Gly1681Asp mutation lies in the less conserved fibronectin cluster that contains six fibronectin type III repeats.
In this study, we were dealing with the difficult issue of analyzing exome data using a classical recurrence-based filtration strategy in a Mendelian disorder with genetic heterogeneity.10 Nevertheless, this approach allowed us to identify as the top-scoring gene SORL1, a gene that is a strong candidate for AD and that belongs to a biological pathway related to Aβ metabolism. We have demonstrated an enrichment of rare variants exclusive to ADEOAD patients for this gene. Our results, obtained after a pangenomic screening, were confirmed in a replication sample. In addition to two nonsense mutations that mimic the effects of siRNA suppression and therefore are likely to result in Aβ overproduction,5 in silico analyses predicted a deleterious effect on protein structure and function for at least three, and possibly five missense variations. We conclude that besides the involvement of the APP and PSEN genes, further genetic heterogeneity, involving another gene of the same pathway, is present in ADEOAD. However, due to the lack of sufficient segregation data, we cannot exclude that SORL1 mutations result in incompletely penetrant ADEOAD. This issue will require segregation studies in additional SORL1 families.
This work was supported by a grant from the Clinical Research Hospital Program from the French Ministry of Health (GMAJ, PHRC 2008/067), to DH and DC, and sponsored by the University Hospital of Rouen. We thank Integragen, which performed the exome sequencing. We thank the LITIS and the TIBS team for providing bioinformatics support, André Blavier for the Alamut software, Mario Tosi for helpful discussions, and Tracey Avequin for editing the manuscript.
PHRC GMAJ collaborators
The investigators of the French GMAJ project include Didier Hannequin, Dominique Campion, Olivier Martinaud, Lucie Guyant-Maréchal and David Wallon (Centre Hospitalo Universitaire (CHU), Rouen); Olivier Godefroy and Candice Picard (CHU Amiens); Frédérique Etcharry-Bouyx (CHU Angers); Eric Berger (CHU Besancon); Jean-Francois Dartigues and Sophie Auriacombe (CHU Bordeaux); Vincent de la Sayette (CHU Caen); Francois Sellal (CH Colmar); Olivier Rouaud and Christel Thauvin (CHU Dijon); Olivier Moreaud (CHU Grenoble); Stéphanie Bombois, Adeline Rollin-Sillaire, Marie-Anne Mackowiak and Florence Pasquier (CHU Lille); Isabelle Roullet-Solignac and Alain Vighetto (CHU Lyon); Mira Didic, Olivier Félician and Mathieu Ceccaldi (CHU Marseille); Audrey Gabelle and Jacques Touchon (CHU Montpellier); Martine Vercelletto and Claire Boutoleau-Bretonnière (CHU Nantes); Pierre Labauge and Giovanni Castelnovo (CHU Nimes); Claire Paquet and Jacques Hugon (CHU Lariboisière); Agnès Michon, Isabelle Le Ber and Bruno Dubois (CHU La Salpêtrière, Paris); Catherine Thomas-Antérion (CHU Saint-Etienne); Frédéric Blanc and Christine Tranchant (CHU Strasbourg); Jérémie Pariente, Michèle Puel and Jean-Francois Demonet (CHU Toulouse); Caroline Hommet and Karl Mondon (CHU Tours); Hélène Mollion and Bernard Croisile (CMRR CHU Lyon); Mathilde Sauvée (CHU Nancy); Gaelle Godenèche and Foucauld De Boisgueheneuc (CHU Poitiers).
Supplementary Information accompanies the paper on the Molecular Psychiatry website (http://www.nature.com/mp)