Introduction

The most common cause of dementia in the elderly is Alzheimer's disease (AD), characterized by a progressive decline of cognitive functions, particularly memory. The neuropathological hallmarks of the disease are numerous cortical amyloid plaques formed by aggregation of amyloid β-peptide (Aβ), neurofibrillary tangles containing hyperphosphorylated tau protein, and atrophy. Several hypotheses have been proposed for the pathology of AD, of which the amyloid cascade hypothesis is the most prevalent.1, 2, 3 Oligomerization and accumulation of the Aβ1−40 and Aβ1−42 peptides, as well as failure of normal degradation of the peptides, are the key events in this hypothesis. Sporadic AD accounts for the majority of cases, and familial AD (FAD, defined as at least two affected first-degree relatives) constitutes a smaller fraction of all AD cases. Less than 1% of all AD cases are autosomal dominant, early-onset (before 65 years of age) monogenic forms caused by mutations in one of the three known genes: amyloid precursor protein (APP, OMIM no. 104300; refs 4, 5), presenilin 1 (PSEN1, OMIM no. 607822; ref 6) and presenilin 2 (PSEN2, OMIM no. 606889; refs 7, 8). The functions of these three genes fit well with the amyloid cascade hypothesis of AD aetiology, as the encoded proteins are all involved in APP processing and/or are reported to increase Aβ production.9, 10, 11, 12 In contrast, the ɛ4 allele of the APOE gene is the only consistently confirmed genetic risk factor for the sporadic forms of the disease, and it acts predominantly by reducing the age at onset in a dose-dependent manner.13

There are today a handful of published articles involving linkage analysis on FAD, and even though different research groups have confirmed linkage to regions in chromosomes 6 (refs 14, 15, 16, 17, 18), 9 (refs 14, 16, 17, 18, 19, 20), 10q (refs 14, 16, 17, 18, 21) and 12p (refs 14, 15, 17), no new genes with causal mutations or with increased effects on AD risk have been identified. Other genomic regions that have been shown to be linked to or associated with AD in previous studies are 1q23–31, 2p12–q11, 4p16, 4q35, 5p13–15 and 19q13.14, 16, 17, 18, 19, 21, 22, 23, 24, 25 For recent reviews and meta-analyses see http://www.alzforum.org/res/com/gen/alzgene/linkage.asp, Bertram et al,26 and Kamboh.27

Recently, we published the first Swedish genome-wide scan on 71 AD families (GS1) and found a significant linkage peak corresponding to the APOE region in chromosome 19q13.28 In the present study (GS2), we have added further affected and unaffected individuals to the original 71 families, included 38 novel families and increased the density of genotyped markers (from an average intermarker distance of 8.97 cM to 2.85 cM) by genotyping 1102 novel microsatellites. Thus, we report the results from a genome-wide linkage analysis based on the data generated by genotyping 1289 markers on 468 individuals from 109 Swedish AD families.

Materials and methods

Samples

The families were selected from our research registry of neurodegenerative dementias at the Department of NVS, Karolinska Institutet, Huddinge, Sweden. The families were recruited from all of Sweden through referrals from primary caregivers or memory clinics, or by self-referral. The inclusion criteria for the study were a positive family history of dementia (at least two affected first-degree relatives) and availability of DNA from at least two affected relatives in each family. The family history of dementia, the age at onset (the age when first symptoms appeared) and the disease course were based on medical records, autopsy reports (if available) and genealogy, as well as on interviews with relatives. DNA was available from 292 affected (188 included in GS1) and 176 unaffected individuals from 109 families (Table 1). The average number of unaffected individuals genotyped in each family was 1 (range 0–13). The sample set included 284 women (64.4% affected) and 184 men (59.2% affected), with an average age at onset (AAO) of 68.7±7.5 (±1SD) years. Thirty-five of the families had an early onset (family mean AAO ≤65 years of age), and 87 families had a history of affected individuals in two or more generations, consistent with a dominant inheritance pattern.

Table 1 Pedigree size distribution of the AD families included in the study

All families contained at least one affected family member with a clinical diagnosis of AD according to NINCDS-ADRDA criteria29 and/or a neuropathological diagnosis of AD. In the majority of families, all cases had a clinical diagnosis of AD. In 24 families, there was at least one neuropathological diagnosis of AD according to CERAD criteria.30, 31 However, there were several families with more than one clinical dementia diagnosis, in a single individual and/or in different cases from the same family. Thus, several cases had a diagnosis of AD in combination with atypical AD signs such as predominant frontal symptoms, Lewy body signs, Parkinsonism and vascular components. A group of 18 families had a clinical diagnosis of ‘mixed AD’. These were families in which there were cases with a clinical diagnosis of AD as well as cases with a vascular dementia diagnosis (VaD, including stroke with dementia and multi-infarction dementia) or cases with a combination of both diagnoses. Furthermore, there were families with two different neuropathological diagnoses in two different siblings. Table 2 shows the distribution of the sub-phenotypes among the families based on the diagnoses presented in the affected individuals in the family: definitive AD (at least one family member had autopsy-confirmed AD), clinical AD (all cases had clinical AD), AD and vascular dementia (both AD and VaD were present in the family and/or a single case had a mixed dementia diagnosis), AD and frontotemporal dementia, AD and Lewy body dementia, and AD and Parkinson's disease.

Table 2 Subgrouping of the 109 AD families according to phenotypes present in the family

Of the 109 families, 63 were APOE ɛ4-positive (all affected carried at least one ɛ4 allele) and 46 were ɛ4-negative (at least one affected did not carry any ɛ4 alleles). Families with known mutations in the APP, PSEN1 and PSEN2 genes were excluded. Two of the included markers (D21S1914 and D21S1442) are located within 1.6 Mb of the APP gene, and any duplication resulting in three alleles would be detected in the quality-check procedure. However, duplicated segments smaller than 1.6 Mb and/or duplications resulting in two alleles would not be detected by this method, and thus quantitative PCR for detection of APP duplications is ongoing. Mutation screening in the granulin gene (GRN, MIM no. 138945) has not been performed. Genealogical studies have been made on the majority of the 109 families, and shared ancestries have been identified in 12 families (ie 12 of the families have a genetic history in common with one other family). Three of the families presently living in Sweden originated from Finland in earlier generations, two from Norway and two from Germany, having married into Swedish families.

Blood samples were collected after informed consent from participating individuals or next of kin, and the study was approved by the local Ethics Committee at the Karolinska Institutet, Huddinge, Sweden. DNA was extracted according to standard protocols. Because of small starting amounts of genetic material, whole genome amplification was performed on a total of 78 DNA samples, either by us using the GenomiPhi DNA Amplification kit (GE Healthcare BioSciences AB, Uppsala, Sweden) or by deCODE genetics Inc., Iceland. The markers were genotyped by the genotyping service at deCODE genetics Inc., Iceland, using their 1000 marker panel. DNA samples from 468 individuals were genotyped for 1102 markers. Combining the genotypes from GS1 (187 non-redundant markers genotyped in 188 affected individuals) and the present genome-wide scan (GS2), there was a total of 1289 genotyped markers, giving an average intermarker distance of 2.85 cM. Intermarker distances and marker order were obtained from deCODE and by combining the deCODE map32 with publicly available genetic maps from Marshfield and Généthon.33, 34 Marker allele frequencies were estimated from the families, which tends to give conservative results.35 For graphical presentation, the multipoint (mpt) results were converted to Zlr scores, which, unlike Allegro LOD scores, reflect the sign of dhat.
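As a rough illustration of how the two score scales relate, the sketch below converts between LOD and Zlr using the likelihood-ratio relation Zlr = sign(dhat) × sqrt(2 ln(10) × LOD). It assumes that the Zlr scores used here follow this standard definition; the function names `lod_to_zlr` and `zlr_to_lod` are hypothetical and are not part of Allegro.

```python
import math


def lod_to_zlr(lod: float, dhat_sign: float = 1.0) -> float:
    """Convert an allele-sharing LOD score to a signed Zlr score.

    Assumes Zlr = sign(dhat) * sqrt(2 * ln(10) * LOD). The sign of dhat
    must be supplied separately, because the LOD itself is non-negative.
    """
    if lod < 0:
        raise ValueError("LOD must be non-negative for this conversion")
    magnitude = math.sqrt(2.0 * math.log(10.0) * lod)
    return math.copysign(magnitude, dhat_sign)


def zlr_to_lod(zlr: float) -> float:
    """Inverse conversion; the sign information is lost."""
    return zlr * zlr / (2.0 * math.log(10.0))


if __name__ == "__main__":
    # Example with a LOD of 5.05 and excess sharing (dhat > 0).
    print(round(lod_to_zlr(5.05), 2))
    # The same LOD with dhat < 0 gives a negative Zlr.
    print(round(lod_to_zlr(5.05, dhat_sign=-1.0), 2))
```

A signed scale of this kind lets a plot distinguish excess allele sharing from deficient sharing, which a LOD curve alone cannot show.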

Statistical analyses

Linkage analysis was performed using the information from 1289 markers on the whole set of 109 families and in the APOE-stratified groups (63 APOE ɛ4-positive families and 46 APOE ɛ4-negative families) using Allegro version 1.2,36 applying both mpt and singlepoint (spt) analyses. All unaffected siblings were coded as having unknown disease status in the calculations. In the subset of 46 ɛ4-negative families, the affection status of affected individuals carrying an ɛ4 allele was set to ‘unknown’. These individuals therefore do not contribute to the LOD score, but they add information about phase. Non-parametric allele-sharing LOD scores, Zlr scores and NPL scores were obtained. The non-parametric model was used because the pedigrees show mixed patterns of disease inheritance and the true underlying inheritance model is unknown. We used the exponential model because of its higher robustness when handling pedigrees of different sizes, and the scoring function Spairs, as suggested by McPeek,37 when there is no clear disease inheritance model. To take the differences in family size into consideration (ranging from 2 to 23 bits), the family weighting option ‘power: 0.5’ was used, as suggested in the Allegro manual. Parametric analysis allowing for heterogeneity, assuming penetrances of 5, 65 and 80% for homozygous wild type, heterozygotes and homozygotes for the disease allele, respectively, and a disease allele frequency of 5%, was also performed using the parametric option in Allegro. Three of the largest families had to be reduced in size (younger genotyped persons with unknown disease status were removed) to fit the size limitation of 25 bits in the Allegro program and also to save computer time. Linkage analysis calculations for chromosome 19 were performed both with and without the APOE genotypes.
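To make the parametric model more concrete, the following minimal sketch computes the population prevalence and the genotype distribution among affected individuals implied by the stated penetrances (5, 65 and 80%) and disease allele frequency (5%). It assumes Hardy-Weinberg genotype proportions, which the text does not state, and the function names are illustrative only; it is not the likelihood calculation performed by Allegro.

```python
from typing import Tuple

Triple = Tuple[float, float, float]


def genotype_frequencies(q: float) -> Triple:
    """Frequencies of (wild type/wild type, heterozygote, disease/disease)
    for disease allele frequency q, assuming Hardy-Weinberg proportions."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    p = 1.0 - q
    return (p * p, 2.0 * p * q, q * q)


def prevalence(q: float, penetrances: Triple) -> float:
    """Population prevalence implied by the single-locus model."""
    freqs = genotype_frequencies(q)
    return sum(f * w for f, w in zip(freqs, penetrances))


def genotypes_given_affected(q: float, penetrances: Triple) -> Triple:
    """Posterior genotype probabilities for an affected individual (Bayes' rule)."""
    freqs = genotype_frequencies(q)
    k = prevalence(q, penetrances)
    g0, g1, g2 = (f * w / k for f, w in zip(freqs, penetrances))
    return (g0, g1, g2)


if __name__ == "__main__":
    model = (0.05, 0.65, 0.80)  # penetrances quoted in the text
    q = 0.05                    # disease allele frequency quoted in the text
    print(prevalence(q, model))
    print(genotypes_given_affected(q, model))
```

The first element of the posterior tuple is the share of affected individuals who would be phenocopies under this model, that is, affected without carrying the disease allele.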

Simulation analysis under the null hypothesis of no linkage across the whole genome was performed 1000 times with simulated genotypes, using the same marker map, allele frequencies and pedigree structure and assuming 7% missing genotypic data. We developed a grid-aware computer implementation of the Allegro program, by which many thousands of calculations can be executed in parallel, thereby saving months of computer time.38 To estimate the empirical genome-wide significance (GWS) level, the three highest obtained LOD scores for the total set (N=109 families) and for the APOE ɛ4-stratified groups (N=63 and N=46, respectively) were used as threshold levels, and we counted the number of times LOD scores at or above these thresholds occurred in the simulated data. A genome-wide P-value below 0.05, in other words an occurrence of a LOD score equal to or higher than a given threshold once per 20 genome scans, was used as the definition of significance, in agreement with Lander and Kruglyak.39
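One common way to turn such simulations into an empirical genome-wide P-value is sketched below: the P-value for an observed LOD score is the fraction of simulated genome scans whose maximum LOD score reaches that threshold. The exact counting rule used in this study is not specified in the text, so the function `empirical_gws_p` and the toy input values are illustrative assumptions rather than the authors' procedure.

```python
from typing import Sequence


def empirical_gws_p(observed_lod: float, simulated_max_lods: Sequence[float]) -> float:
    """Fraction of null genome scans whose maximum LOD is >= observed_lod.

    simulated_max_lods holds one value per simulated genome scan: the
    highest LOD score found anywhere in that scan under no linkage.
    """
    if len(simulated_max_lods) == 0:
        raise ValueError("at least one simulated scan is required")
    hits = sum(1 for lod in simulated_max_lods if lod >= observed_lod)
    return hits / len(simulated_max_lods)


def is_genome_wide_significant(p_value: float, alpha: float = 0.05) -> bool:
    """Apply the P < 0.05 genome-wide criterion described in the text."""
    return p_value < alpha


if __name__ == "__main__":
    # Toy maxima from eight hypothetical null scans (not real data).
    null_maxima = [1.2, 0.9, 2.4, 1.7, 3.3, 1.1, 2.0, 1.5]
    p = empirical_gws_p(2.31, null_maxima)
    print(p, is_genome_wide_significant(p))
```

With 1000 simulated scans, the smallest non-zero P-value this estimator can return is 0.001, which bounds the resolution of any significance level obtained this way.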

Power calculations on the 109 families were performed using the AllegroSim option assuming heterogeneity with α=30% (ie 30% of families linked to other loci), 7% missing genotypes and the authentic pedigree structures.

Results

We report the results of a follow-up genome scan in Swedish AD families. In this study, we genotyped 1102 microsatellite markers in 468 individuals from 109 AD families with a success rate of 96%, and possible genotyping errors were minimized by checking for Mendelian inconsistencies before starting the linkage analysis. These genotypes were combined with the non-redundant genotypes from our previous study,28 resulting in information from a total of 1289 genotyped markers. All linkage data presented in table format were acquired without the APOE genotypes. Supplementary Figure 1 illustrates the Zlr score curves for all analysed chromosomes. Table 3 presents a summary of the results with the highest obtained non-parametric mpt/spt LOD scores for the full set of families (N=109), the APOE ɛ4-positive (N=63) and the ɛ4-negative (N=46) stratified families. Additional results with all mpt LOD scores ≥1 are listed in Supplementary Table 1, and all obtained spt LOD scores ≥1.5 are listed in Supplementary Table 2. The entire set of 109 families generated its highest linkage peak in 19q13.33, with a significant mpt LOD score of 5.05 and a significant spt LOD score of 3.86. The second highest spt LOD score for the 109 families was located in 4q25 and did not reach significance: spt LOD=2.31 (GWS P=0.31), corresponding mpt LOD=0.16 (P=1).

Table 3 A summary of the single highest obtained multipoint (mpt) and the single highest obtained singlepoint (spt) non-parametric LOD scores in the complete family material, the APOE ɛ4-positive and APOE ɛ4-negative families, respectively, are listed with their corresponding genome-wide significant values (P-values)

The highest and most significant mpt LOD score in the APOE ɛ4-positive families was 5.31 (P=0.0011), located at marker D19S903 approximately 0.35 cM from the APOE locus. The only other LOD score that reached GWS in the ɛ4-positive families was obtained in 6p24 (spt LOD=3.21, P=0.044). However, its matching mpt score was low and non-significant (LOD=0.23), as were the spt LOD scores for the flanking markers. The third highest obtained spt LOD score in this group was 2.31 (P=0.28) at the same peak marker, D4S2989 in 4q25, as in the whole set of families. The corresponding mpt LOD score (0.07) in 4q25 was non-significant (P=1), as were the spt LOD scores for the flanking markers.

None of the obtained linkage peaks reached significance in the subset of 46 APOE ɛ4-negative families. The LOD scores in chromosome 19 were not significantly different when APOE genotypes were included in the analysis, except for the 63 APOE ɛ4-positive families, where the maximum spt LOD score increased from 5.3 to 9.3 at the APOE marker.

Parametric analysis allowing for heterogeneity did not identify any loci other than APOE, and the proportion of families linked to APOE was 76% (data not shown).

Discussion

By increasing the number of participating families in the present study from 71 to 109 families, we hoped to increase the genetic information enough to obtain stronger linkage signals in the suggestive linkage regions observed in the original genome scan.28 Somewhat surprisingly, the only significant linkage peak obtained by mpt analysis in the full set of 109 Swedish AD families (mpt LOD=5.05, P=0.015) was still a reflection of the known APOE gene in chromosome 19q13. Sixty-three of the 109 families (58%), in which all affected carried at least one ɛ4 allele, generated a significant mpt LOD of 5.31 at a distance of 0.35 cM centromeric to the APOE gene even in the absence of the APOE genotypes in the analysis. The data suggest that our analysed family material is under a very strong influence of the ɛ4 allele, as supported by both non-parametric and parametric linkage analysis. The AAO was 68.7 years for all families, which is also the age at which the APOE gene has been reported to exert its strongest effect.40 Besides the peak in 19q13.33, the only other LOD score that reached the level of significance was in 6p24 for the subset of 63 APOE ɛ4-positive families, with a spt LOD score of 3.21 (P=0.044) at marker D6S1279. However, the weak mpt LOD score and the non-significant spt LOD scores for the flanking markers (Supplementary Table 2) imply that this is most likely a spurious effect. Furthermore, the 6p24 region has not been reported earlier to be linked to AD, although findings of several smaller linkage peaks (LODs between 1.5 and 1.9) in the adjacent chromosomal regions 6p21 and 6q21 have been described.14, 15, 17, 18 In addition to the already discussed highly significant linkage to 19q13.33, the 4q25 region appeared as a suggestive peak both in the original (spt LOD=2.35, P=0.35 in the ɛ4-positive subgroup) and in this extended (spt LOD=2.31, P=0.28 in the ɛ4-positive subgroup, and LOD=2.31, P=0.31 in 109 families) genome-wide scan for AD susceptibility genes. Although simulation analyses and the spt LOD scores of flanking markers (Supplementary Table 2) suggest that these LOD scores are not significant, one should be cautious about ruling out the possibility of a susceptibility gene in this part of the genome, as at least one candidate gene, COL25A1, is located in this region.41, 42 Furthermore, an ongoing association study of the COL25A1 gene indicates that it may contribute to the genetic risk of developing AD (C Forsell et al, manuscript in preparation).

It is also noteworthy that we did not replicate the linkage peaks previously reported in sib-pairs on chromosomes 9, 10 or 12.14, 15, 16, 17, 18, 19, 20, 21 It is unlikely that this inconsistency reflects the relative contribution of APOE in the different study populations, as the study by Blacker et al18 also had strong evidence of linkage to the APOE gene (LOD 7.7), similar to our study. However, the stage 2 study by Myers et al17 only reported a weak linkage signal in the APOE region (LOD below 1.6). Furthermore, neither chromosome 17 (granulin, GRN) nor chromosome 21 (APP) generated any suggestive linkage peaks in our linkage analysis, suggesting that these two genes have no major causative effect in our AD population.

The analysed families were not diagnosed solely with AD, but with a combination of dementias. It is possible that this mixture made the family set too heterogeneous genetically. However, it has also been hypothesized that the same genes may contribute to neurodegeneration in general, but that different combinations of susceptibility genes will result in different phenotypes.43, 44, 45 Furthermore, there are examples of varied clinical phenotypes in family members carrying the same APP duplication46 or GRN mutation47 and in families segregating frontotemporal dementia with motor neuron disease,48 and we clearly have families with variable presenting symptoms as well as neuropathological changes among siblings. Another explanation for the lack of significant linkage might be the complex segregation pattern in some of our AD families. Dementia was present in both paternal and maternal lines for 11 of the families, adding further to the complexity of the involved pathogenic genes. Looking back on the published genetic studies of familial AD, the lack of success in identifying additional genes harbouring causative mutations gives a sense of the considerable difficulties in determining the aetiology of the disease. The answer may lie in improving the clinical classification of the different sub-phenotypes of dementia, in using only autopsy-confirmed AD cases in the linkage analysis, as Gordon et al23 did in their suggested ‘gold-standard’ method, or in mapping quantitative traits such as measurable biomarkers, for example, protein levels in cerebrospinal fluid or variables obtained by brain imaging using MRI.49 Finally, a whole genome association study design may prove to be a more fruitful approach for identifying additional susceptibility genes in AD. The first reported high-density genome-wide SNP association study of 1086 definitive AD cases showed the strongest association to an SNP located 14 kb from the APOE ɛ4 variant,50 but no other significant loci were repeated. The major obstacle regardless of approach is probably that the number of cases required may be 10- to 100-fold that which has been used in single studies so far.51 Follow-up studies are ongoing in the subset of 24 families with autopsy-confirmed AD and in the 18 families with ‘mixed AD’ as well as in families with shared ancestries.