Introduction

Alzheimer's disease (AD; MIM104300) is the major cause of dementia in the elderly. Its main neuropathological signs are extracellular depositions of amyloid β-protein and intracellular neurofibrillary tangles composed of hyperphosphorylated tau. It is a genetically heterogeneous and complex condition with both sporadic and familial forms. Several mutations in the genes encoding the amyloid precursor protein (APP; MIM+104760), presenilin-1 (PSEN1; MIM+104311) and presenilin-2 (PSEN2; MIM+600759) have been identified in early-onset familial AD in which the inheritance follows a Mendelian autosomal dominant pattern,1, 2 aside from two amyloid precursor protein mutations that are suggested to be recessive.3, 4 However, mutations in these three genes do not explain all familial AD cases and they account for less than 1% of all AD cases. For the much more frequent late-onset form of AD, which shows complex inheritance, only one gene has consistently been identified as a major risk factor. A common polymorphism (ɛ4) in the gene encoding apolipoprotein E (APOE) confers increased risk for late-onset AD and lowers the age at onset in a dose-dependent manner.5, 6 More recently, two large and independent genome-wide association studies have shown evidence for positive associations between late-onset AD and polymorphisms in the CLU, CR1 and PICALM genes.7, 8 The effect of these variants on risk is much weaker than that of APOE ɛ4.

Previous genome-wide family-based linkage analyses have identified several loci linked to AD9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 and confirmed linkage by several groups has been reported to chromosome (chr.) 9p, 9q, 10q, 12p and 19q.20, 21, 22, 23, 24 Extensive follow-up and fine mapping studies have been performed in these linked genetic regions without identifying a new AD gene.25, 26, 27, 28, 29, 30, 31, 32, 33 A summary of all reported genetic linkage and association studies for AD is available in the AlzGene database.34

It has been well documented for many years that hypertension and high levels of blood cholesterol in midlife are associated with an increased risk of developing stroke and cognitive impairment.35, 36, 37, 38 Epidemiological studies aimed at identifying vascular risk factors for AD are numerous (see the review by Cechetto et al.39), and a large effort is presently directed at identifying the genetics behind these risk factors. During the past year, large-scale genome-wide association studies have reported positive associations with NINJ2, a susceptibility gene for stroke,40 with loci influencing lipid levels and coronary heart disease,41 and with several genes influencing blood pressure.42, 43, 44

Few genetic studies have been reported and no common susceptibility gene has yet been identified for multi-infarction dementia, or vascular dementia (VaD), which is the second most common type of dementia after AD.45, 46, 47 There are conflicting results regarding the effect of APOE ɛ4 on the risk for VaD.48, 49, 50, 51, 52

Cerebral amyloid angiopathy, a common feature of AD, is caused by progressive deposition of amyloid in cerebral and leptomeningeal blood vessels, with subsequent degenerative vascular changes that often result in cerebral hemorrhage, ischemic lesions and dementia.

Interestingly, the first amyloid precursor protein mutation identified caused hereditary cerebral hemorrhage with amyloidosis in a Dutch family.53 The Flemish and Iowa amyloid precursor protein mutations also cause severe cerebral amyloid angiopathy, whereas the classical AD mutations lead to less severe forms of cerebral amyloid angiopathy.54

CADASIL (cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy) is a rare familial form of VaD, showing an autosomal dominant pattern of inheritance, with onset around the age of 45 years.55 There is often a history of migraine, and many patients later develop subcortical dementia and pseudobulbar palsy. CADASIL is mapped to 19p13.2-p13.1 and mutations have been identified in the Notch3 gene.55, 56

In 2008, we published the results of an extended genome-wide scan, henceforth referred to as the GS2 study, in which we genotyped 292 affected and 176 unaffected family members from 109 pedigrees segregating AD.23 Linkage analysis of these 109 families resulted in a significant multipoint logarithm of the odds (mpt LOD) score of 5.05 on chr. 19q13, which was interpreted as attributable to the APOE gene located in the linkage region. In this paper, we present the results from linkage analysis of a subset of the 109 AD pedigrees in which the families, in addition to having cases with clinical AD, also have family members with VaD and/or ‘mixed AD’, suggesting that vascular components contribute to their disease etiology. We fine mapped the maximum LOD-2 (LODmax-2) interval under the linkage peak in two steps, first using microsatellite markers and then single-nucleotide polymorphisms (SNPs), and subsequently performed both linkage analysis and family-based association analysis of candidate genes.

Materials and methods

Family material

Blood samples were collected after obtaining informed consent from participating individuals or next of kin, and the study was approved by the local Ethics Committee at the Karolinska Institutet, Huddinge, Sweden. DNA was extracted according to standard procedures. Genealogical studies using parish records dating back to the 1600s have been performed in an attempt to establish ancestral relationships between our sampled AD/VaD families, thus yielding the current genealogical database comprising 9000 persons. In all, 18 of the 109 families included in the GS2 study contained cases with a clinical diagnosis of AD, as well as cases with a VaD diagnosis (including stroke with dementia, multi-infarction dementia and sub-cortical VaD, and cases with Hachinski ischemic score57 >4), cases with a ‘mixed dementia’ diagnosis or cases where the neuropathological examination revealed multiple infarctions. Two pedigrees with clinical description of all sampled cases are included in the Supplementary material (Supplementary Figure 1 and legend). These 18 families are from this point investigated together as a subset in the linkage analyses. The 18 families showed a history of AD spanning 2–4 generations back and comprised 95 genotyped individuals, including 55 affected subjects and 40 individuals (20 males/20 females) with unaffected phenotype. Of these 55 cases (range 1–6 subjects per family; mean 3.1), 33 were females. The mean age at onset was 68.7 (range 54–81) years and 65% of the cases carried at least one APOE ɛ4 allele.

By examining medical records and updating the family material, we added 12 new families, making a total of 30 families with mixed AD/VaD. In all, 15 of the cases were examined neuropathologically, resulting in 14 cases with possible, probable or definitive AD and one case with VaD.

Genotyping

In a 6-cM-long linked genetic region defined by the LODmax-1 interval spanning from the 20p telomere to D20S113, six new microsatellite markers (D20S1155, D20S1157, D20S1156, D20S103, D20S199 and D20S906) were selected for fine mapping (FM1) to create a higher resolution. DNA samples from 93 individuals from 18 families were genotyped at deCODE Genetics, Reykjavik, Iceland.

DNA samples from 228 persons from the updated 30 families were selected to be further genotyped in the second fine-mapping approach (FM2) on the basis of the SNPs in selected genes in the LODmax-2 defined interval, now spanning 5.81 cM, which is equivalent to 1.45 Mb from the p-telomere to marker D20S906. The LODmax-2 defined interval contained 34 transcripts according to NCBI Map viewer (Build 36.2). In all, 18 of these transcripts were selected (Supplementary Table 1) for targeted genotyping using a combination of tagSNPs and non-synonymous SNPs. Transcripts were selected manually according to either (1) known gene function that could conceivably be involved in AD or (2) predicted genes with uncharacterized function. TagSNPs with minor allele frequency >0.10 were identified by the Tagger algorithm (r2>0.8) as implemented by the HaploView v.4.1 software58 for all selected transcripts using HapMap phase 2 data.59 Non-synonymous SNPs with minor allele frequency >0.05 or average heterozygosity >0.15 were identified in the UCSC Genome Browser Assembly (hg18) and the NCBI SNP databases (dbSNP Build 127). In total, 158 tagSNPs and 9 non-synonymous SNPs were selected in 18 transcripts (Supplementary Table 2). Assays for all 167 SNPs were designed for single base-pair extension chemistry and analyzed on a Sequenom MassArray Autoflex (Sequenom, San Diego, CA, USA) at the Mutation Analysis Facility, Karolinska Institutet (Huddinge, Sweden). SNPs with a genotyping success rate less than 75% were discarded, leaving 141 SNPs for further analysis. All SNPs were in Hardy–Weinberg equilibrium (P-value >0.001), which is the threshold level used by the HapMap project.59
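The two quality-control filters described above (a minimum genotyping success rate per SNP and a Hardy–Weinberg check) can be sketched in Python as follows. This is a minimal illustration, not the pipeline used in the study: the function names, the coding of genotypes as 0, 1 or 2 copies of one allele with None for a failed call, and the use of a one-degree-of-freedom χ2 test (rather than, for example, an exact test) are all assumptions. In family data the Hardy–Weinberg test would normally be restricted to unrelated individuals such as founders.

```python
from collections import Counter
from math import erfc, sqrt


def call_rate(genotypes):
    """Fraction of samples with a successful call (None marks a failed call)."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    return erfc(sqrt(chi2 / 2.0))


def passes_qc(genotypes, min_call_rate=0.75, hwe_threshold=0.001):
    """Apply the call-rate filter first, then the Hardy-Weinberg filter.

    The default thresholds mirror the 75% success rate and P > 0.001
    quoted in the text; the genotype coding is an assumption.
    """
    if call_rate(genotypes) < min_call_rate:
        return False
    counts = Counter(g for g in genotypes if g is not None)
    return hwe_chi2_pvalue(counts[0], counts[1], counts[2]) > hwe_threshold
```

A SNP would be retained only if `passes_qc` returns True for its vector of genotype calls.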

Statistical analysis

Linkage calculations were performed using Allegro v.1.2c60 with the scoring function ‘spairs’ using both a non-parametric model and a parametric model assuming dominant inheritance, and allowing for genetic heterogeneity between families (resulting in heterogeneity LODs).60 In the parametric model a disease allele frequency of 5% was used, and penetrances of 5, 65 and 80% for wild-type homozygotes, heterozygotes and disease allele homozygotes, respectively. Allele-sharing mpt LOD scores, as well as single-point (spt) LOD scores were calculated according to the exponential model. Non-parametric linkage (NPL) statistics were also obtained by Allegro. When combining the family scores to obtain an overall score, we used a compromise between weighting the families equally and weighting the affected pairs equally, using the ‘power:0.5’ option, avoiding a bias due to substantial differences in family sizes. Simulation analysis under the null hypothesis of no linkage across the whole genome was performed 1000 times for the 18 families with simulated genotypes using the same marker map, allele frequencies, pedigree structure and percentage of missing data (7%) as in the GS2 linkage analysis.23 We developed a grid-based software implementation of the Allegro program, by which many thousands of calculations can be executed in parallel and can thereby save months of computational time.61 To estimate the empirical genome-wide significance level, the three highest obtained LOD scores for the set of 18 families were used as threshold levels and were compared with the number of times these LOD scores, or higher, were estimated in the simulated data. 
A genome-wide significance probability P-value of less than 0.05, or, in other words, occurrence of an LOD score equal to or higher than a given threshold once per 20 genome scans, was used as the definition of significance, in agreement with Lander and Kruglyak.62 The LODmax-1 linkage region is commonly taken as a crude estimate of the 95% confidence interval for the location of a disease-causing gene, assuming that the linkage reflects the presence of a single susceptibility locus.63 We chose to use the more conservative LODmax-2-defined interval in the second step of fine mapping to minimize the risk of excluding any region of interest.
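The empirical significance estimate and the LOD-drop intervals described above can be illustrated with a short Python sketch. It rests on stated assumptions: each simulated genome scan is summarized by its single highest LOD score, the linkage curve is evaluated on a discrete grid of map positions without interpolation, and the function names are hypothetical. It is not the grid-based Allegro implementation used in the study.

```python
def empirical_genomewide_p(observed_lod, simulated_max_lods):
    """Share of null genome scans whose highest LOD reaches the observed LOD."""
    hits = sum(1 for lod in simulated_max_lods if lod >= observed_lod)
    return hits / len(simulated_max_lods)


def lod_drop_interval(positions_cm, lods, drop=1.0):
    """Positions bounding the region around the peak with LOD >= peak - drop.

    drop=1.0 corresponds to a LODmax-1 interval, drop=2.0 to LODmax-2.
    """
    peak = max(range(len(lods)), key=lods.__getitem__)
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions_cm[left], positions_cm[right]


# Toy values, invented for illustration only.
positions = [0, 2, 4, 6, 8, 10]
curve = [2.4, 2.5, 1.9, 1.4, 0.6, 0.2]
print(lod_drop_interval(positions, curve, drop=1.0))
print(lod_drop_interval(positions, curve, drop=2.0))
```

With 1000 simulated scans, an observed LOD score would meet the P<0.05 criterion if fewer than 50 simulated scans reach or exceed it. The wider interval returned for `drop=2.0` reflects why the LODmax-2 interval is the more conservative choice.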

We pooled data from the previously genotyped 44 microsatellites in chr. 20 (FM1) with the new SNP genotypes, making a total of 234 genotyped individuals: 96 affected and 138 unaffected family members (FM2). We used the cluster option in the Merlin genetic software64 to identify independent SNPs (r2>0.3), and linkage analysis was thereafter performed using Allegro v.1.2c, as described above. As a result, 69 of the 141 successfully genotyped SNPs were included in the linkage analysis (Supplementary Table 2). Inter-marker distances (in cM) between the SNPs were estimated.
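The idea of retaining only approximately independent SNPs for linkage analysis can be sketched as follows. This is not the clustering algorithm implemented in Merlin. It is a simple greedy pass that approximates r2 by the squared Pearson correlation of genotype dosages, and it assumes that 0.3 is the r2 value above which two SNPs are treated as correlated. The function names and toy data are hypothetical.

```python
def dosage_r2(x, y):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def prune_correlated(snps, max_r2=0.3):
    """Walk the SNPs in map order; keep one only if its r2 with every
    previously kept SNP is at most max_r2."""
    kept = []
    for name, dosages in snps:
        if all(dosage_r2(dosages, other) <= max_r2 for _, other in kept):
            kept.append((name, dosages))
    return [name for name, _ in kept]


# Toy dosages, invented for illustration only.
toy = [
    ("snpA", [0, 1, 2, 1, 0, 2]),
    ("snpB", [0, 1, 2, 1, 0, 2]),  # identical to snpA, so r2 = 1
    ("snpC", [1, 1, 0, 2, 1, 1]),
]
print(prune_correlated(toy))
```

In related individuals, such correlations would normally be estimated from founders or from reference data rather than from all family members.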

The family-based association of disease traits procedure of PLINK v.1.06 was used.65 This procedure decomposes the data set into nuclear families and implements a sibling transmission-disequilibrium test.66 To ensure that the χ2 statistics were valid, we only considered results from SNPs for which PLINK calculated that at least 10 alleles were expected among the affected individuals. In addition, the data were analyzed using FBAT 2.0.2 (http://www.biostat.harvard.edu/~fbat/), including hbat analysis of SNPs within known linkage disequilibrium blocks, and PBAT 3.6 (data not shown).67, 68

Results

A total of 95 individuals (55 affected and 40 unaffected) from 18 families with mixed AD/VaD were successfully genotyped for a total of 1289 markers evenly distributed on chr. 1-22 and X. The maximum single-point (spt) non-parametric linkage score obtained genome-wide was 3.09 in chr. 20p, with a genome-wide significance P-value of 6.5 × 10−4 (Table 1). The highest non-parametric LOD score obtained genome-wide was 2.48 (mpt) in 20p at marker D20S117, with an spt LODmax score of 2.97 at the same marker. The three highest genome-wide LOD scores are presented in Table 1; however, none of them, nor the LOD score at the APOE locus in chr. 19, reached the level of genome-wide significance based on simulation analysis. Nevertheless, the significant spt non-parametric linkage score on 20p prompted us to examine the genetic region further with a higher density of markers.

Table 1 The three highest obtained genome-wide non-parametric LOD scores in 18 families with a mixed phenotype of AD/VaD, with corresponding results from the simulation analysis

Six microsatellite markers distributed along a 6-cM-long region, defined by LODmax-1, were successfully genotyped in FM1 in 85 DNA samples: 51 affected and 34 unaffected from 18 families (Table 2). The non-parametric mpt LODmax score changed only slightly, to 2.54 at D20S1156 (with the six added markers) from 2.48 at D20S117 before fine mapping. The mpt linkage peak moved closer to the p-telomere with the additional information from four markers located distal to D20S117. The spt LOD score peak was still at marker D20S117, with an LOD score of 2.99.

Table 2 Maximum non-parametric LOD scores (MLS) from linkage analyses on chr. 20 data from the genome-wide scan (GS2; 38 markers), fine mapping with microsatellites (FM1; 44 markers) and extended analysis of microsatellites together with SNPs (FM2; 113 markers)

In the second stage of fine mapping involving 30 families, nine samples (3.6%) had SNP genotyping success rates less than 85% and were discarded, leaving 219 persons for further association analysis. A total of 29 424 genotypes from 141 SNPs were obtained, corresponding to an average success rate of 95.3%. When data from microsatellite markers and SNPs in linkage equilibrium from 30 families with mixed AD/VaD were combined (FM2), the highest obtained non-parametric LOD score was 1.79 (mpt) at rs2144151 and 2.65 (spt) at rs1014897 (Table 2). Both markers are intronic SNPs in the angiopoietin-4 (ANGPT4) gene. This represents a drop in mpt LODmax from the 2.54 obtained using only microsatellite marker data from 18 families. The analyzed LODmax-2 region is schematically drawn in Figure 1. When only the originally analyzed 18 families with mixed AD/VaD were included, the linkage results showed stronger evidence of linkage than for the full set of 30 families, with an LODmax of 2.92 (mpt) at marker rs2008022 and 2.56 (spt) at D20S117. The mpt parametric heterogeneity LODs indicate that the fraction of unlinked families (α) was 38% for the 30 families (data not shown). In contrast, a separate linkage analysis of the FM2 genotypes performed on the 18 original families generated a parametric heterogeneity LODmax of 2.82 at rs2008022 with no indication of heterogeneity (α=0%).
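The average genotyping success rate quoted above can be checked with simple arithmetic. The sketch below assumes that the denominator is the number of retained SNPs multiplied by the number of retained persons; the variable names are illustrative.

```python
n_snps = 141                 # SNPs retained after quality control
n_persons = 219              # persons retained after sample filtering
genotypes_obtained = 29_424  # genotypes reported in the text

possible = n_snps * n_persons          # maximum number of genotype calls
success_rate = genotypes_obtained / possible

print(f"{possible} possible genotypes, success rate {success_rate:.1%}")
# prints: 30879 possible genotypes, success rate 95.3%
```

Under that assumption, the reported figure of 95.3% is reproduced.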

Figure 1
Schematic map of the fine-mapped region in 20p13, defined by LODmax-2 and spanning 1.45 Mb. SNPs were selected in 18 candidate transcripts. The maximum observed non-parametric LOD score and the peak location are shown for FM2 (combined microsatellites and SNPs). Each SNP with P<0.05 in the PLINK analysis is marked with an asterisk at its corresponding gene.

In the familial association analysis using PLINK, nine SNPs in seven genes obtained P-values <0.05 (Table 3). The lowest P-value recorded (P=0.002) was for rs6051900, an SNP located in the RanBP-type and C3HC4-type zinc finger-containing 1 gene (RBCK1). Another notable marker is rs6055803 (P=0.017), an SNP in the ANGPT4 gene. Two transcripts (RBCK1 and C20orf54) contained two SNPs each with P<0.05, and in both cases there was linkage disequilibrium between the two intragenic markers in our data. No SNP was significantly associated after Bonferroni correction. In addition, the data were analyzed using FBAT and PBAT, the results of which were in general in good agreement with the PLINK analysis (data not shown).
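To illustrate why the nominal associations above do not survive Bonferroni correction, the following Python sketch applies the correction to the two P-values quoted in the text. The number of tests is an assumption: the text does not state how many SNPs passed the expected-allele-count criterion, so all 141 successfully genotyped SNPs are used here as a stand-in. The function name is hypothetical.

```python
def bonferroni(p_values, n_tests, alpha=0.05):
    """Return the per-test threshold and, for each P-value, a tuple of
    (raw P, Bonferroni-adjusted P capped at 1, significant after correction)."""
    threshold = alpha / n_tests
    rows = [(p, min(1.0, p * n_tests), p < threshold) for p in p_values]
    return threshold, rows


# P-values for rs6051900 and rs6055803 as quoted in the text;
# n_tests=141 is an assumed number of tests.
threshold, rows = bonferroni([0.002, 0.017], n_tests=141)
print(f"per-test threshold: {threshold:.2e}")
for raw, adjusted, significant in rows:
    print(f"raw P={raw}, adjusted P={adjusted:.3f}, significant={significant}")
```

With 141 tests the per-test threshold is about 3.5 × 10−4, so even the lowest observed P-value of 0.002 falls short. A smaller number of effective tests would relax the threshold, but a raw P-value of 0.002 would only pass with 25 or fewer tests.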

Table 3 SNPs with P-values <0.05 in the familial association analysis by PLINK

Genealogical studies showed that seven of the families share a common ancestor with at least one other analyzed family, but no evidence was found of a common ancestor for all 30 AD/VaD pedigrees, although many of the families originate from the same county in northern Sweden. Inspection of the haplotypes from GS2 and from the microsatellite FM1 data revealed no signs of shared haplotypes between the families in the 20p13 region (data not shown).

Discussion

From a genetic point of view, AD is a typical complex disease with multiple genetic and environmental factors contributing to the disease etiology, which makes family-based linkage and association studies less powerful. We selected families with a clinical diagnosis of AD, as well as vascular or mixed dementia. Using these inclusion criteria we first identified 18 families from the set of 109 AD families in our previously published GS2.23 A genome-wide linkage analysis on these 18 families (maximum spt non-parametric linkage score of 3.1) prompted us to fine map a linked region in 20p using both microsatellites and SNPs. By re-examining medical records and adding new families, the number of families included in the SNP-based fine mapping was increased from 18 to 30. A total of 18 transcripts were selected and SNPs were genotyped for both linkage and family-based association analysis. Linkage analysis with data obtained at this stage did, however, result in a decrease in the LOD score, possibly because the 12 additional families had a more pronounced vascular phenotype. This could explain both the higher degree of genetic heterogeneity and the lower LOD scores obtained for the full set of 30 families compared with the original 18 families.

The family-based association test is based on the detection of transmission distortion of a particular allele to affected members in nuclear families. A major advantage of this type of test in comparison with population-based association is its robustness against population stratification.69 A disadvantage is that it requires genetic variation to be transmitted within a nuclear family; that is, at least one parent needs to be heterozygous. Data from homozygous parents do not contribute to the test statistic, and as SNPs in general have low levels of heterozygosity, this reduces the statistical power. This loss of information can be extensive, and it is therefore generally accepted that a population-based association study without any population stratification is more powerful than a family-based association study.70 In agreement with this low power, the nine SNPs with P<0.05 did not survive Bonferroni's correction for multiple testing.
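The dependence on heterozygous parents can be made concrete with a sketch of the classic parent-to-offspring transmission-disequilibrium test. This is an illustration of the general principle only, not the sibling-based PLINK procedure used in the study. The input format (one record per parent and affected child, giving the parental genotype and the allele passed on) and the function name are assumptions.

```python
from math import erfc, sqrt


def tdt(transmissions, test_allele):
    """Classic TDT: compare how often heterozygous parents transmit
    test_allele versus the other allele to an affected child."""
    transmitted = not_transmitted = 0
    for (allele_1, allele_2), passed in transmissions:
        if allele_1 == allele_2:
            continue  # homozygous parent: no information on transmission
        if test_allele not in (allele_1, allele_2):
            continue  # parent does not carry the allele being tested
        if passed == test_allele:
            transmitted += 1
        else:
            not_transmitted += 1
    informative = transmitted + not_transmitted
    if informative == 0:
        return {"transmitted": 0, "not_transmitted": 0, "chi2": None, "p": None}
    chi2 = (transmitted - not_transmitted) ** 2 / informative
    # 1-df chi-square survival function.
    return {
        "transmitted": transmitted,
        "not_transmitted": not_transmitted,
        "chi2": chi2,
        "p": erfc(sqrt(chi2 / 2.0)),
    }


# Toy data, invented for illustration: 20 heterozygous and 30 homozygous parents.
toy = (
    [(("A", "G"), "A")] * 15
    + [(("A", "G"), "G")] * 5
    + [(("A", "A"), "A")] * 30
)
print(tdt(toy, "A"))
```

In this toy example only 20 of the 50 parental transmissions enter the statistic, which illustrates how low heterozygosity reduces the effective sample size of a family-based test.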

In our study there was no overlap between the SNPs with the lowest PLINK P-values and the SNPs generating the highest LOD scores. However, overlapping results were observed for the ANGPT4 gene, with linkage peaks at SNP rs2144151 (mpt) and rs1014897 (spt), as well as association with PLINK at rs6055803 (P=0.017). The three SNPs are located within 11.8 kb of each other, but are not in linkage disequilibrium in our material or according to HapMap (for physical positions see Supplementary Table 2).

A strong candidate gene for mixed AD/VaD in this region is ANGPT4, for which we also obtained the highest LOD score after refined fine mapping and which was the only gene that was among the top hits in both linkage and association studies. Angiopoietins have important roles in vascular development and angiogenesis, and disturbances in this process can result in a range of diseases, such as age-related blindness, cancer and stroke.71, 72 Chapuis et al.73 have shown, by performing a transcriptomic analysis of brains followed by SNP genotyping in the differentially expressed genes, that six SNPs in ANGPT4 were associated with AD. A 65% difference in gene expression was found between the brains of AD patients and controls. None of these SNPs were included in our study, and none of our SNPs were included in theirs. All angiopoietins bind with similar affinity to the Tie2 receptor, an endothelial cell-specific tyrosine-protein kinase receptor. Both human ANGPT1 and 4 function as Tie2 activators, whereas ANGPT2 functions as a natural inhibitor of ANGPT1 and blocks Tie2 signaling.74 The expression pattern and the exact role of ANGPT4 in angiogenesis are still poorly characterized.75 In terms of angiogenic function, it has been shown that endothelial cells from older individuals and animals show impairments in cell proliferation, migration, tube formation and sprouting, compared with those of younger counterparts.76, 77, 78

Among the SNPs with P<0.05 in the PLINK analysis was rs6051900, located in the RBCK1 gene. This SNP obtained the lowest P-value (0.002) in the PLINK analysis. The encoded RBCK1 protein has been identified as an E3 ubiquitin ligase involved in protein regulation by protein degradation through the ubiquitin–proteasome pathway.79 Impaired ubiquitination is known to be involved in neurodegenerative diseases; for instance, mutations in the E3 ligase PARK2 cause autosomal recessive juvenile Parkinsonism (MIM600161).80 Aberrant protein degradation through the ubiquitin–proteasome pathway has also been suggested to be one of the etiologies in AD.81

Previous AD studies have found weak association with SNPs in TRIB3,82, 83 a negative regulator of the inflammation response inducing gene NF-κB,84 located in our linkage interval. We analyzed SNPs in TRIB3 in the present association analysis and our lowest obtained P-value was 0.10. The lack of concordance with the previous studies may be explained by the fact that those studies were case–control-based, with a clinical diagnosis of AD, in which vascular components had not been actively included and were possibly even excluded.

This is to our knowledge the first report on a linkage analysis of families with mixed AD and VaD. Previous linkage studies have mainly focused on ‘pure’ AD or on stroke alone. In the clinical setting, vascular changes are often seen together with AD symptomatology.85 There is no previous report of linkage to 20p13 for AD, VaD or stroke. Association (P=0.009) has, however, been reported to rs6107516 in the prion protein (PRNP) gene in a genome-wide association study of 753 sporadic AD cases.83 The PRNP gene is located at position 4.6 Mb in chr. 20, just outside our fine-mapped LODmax-2 interval (position 0–1.45 Mb). Mutations in PRNP result in Creutzfeldt–Jakob disease, Gerstmann–Sträussler–Scheinker syndrome and fatal familial insomnia, and a few PRNP mutations have been identified in familial AD patients.86 In both AD and prion disease, neurodegeneration is accompanied by cerebral deposits of amyloid and aggregated tau neurofibrils, making PRNP a good candidate gene for AD.87 We have sequenced the translated region of PRNP in cases from our 20p13 linked families, but no mutation was found (data not shown); therefore, it is unlikely that PRNP would be responsible for the phenotypes observed in our sample set.

One form of cerebral amyloid angiopathy leading to intracranial hemorrhage, dementia and often death before the fourth decade is caused by mutations in the cystatin C gene (CST3) located in chr. 20. CST3 is located at 23 Mb, which is far outside of our linked interval and is, therefore, not considered to be the AD-associated gene in our material.

By selecting families with a diagnosis of mixed AD/VaD, we identified a total data set of 30 families. The lack of significant evidence supporting a novel locus in the current study may indicate that the initial linkage to D20S117 was a type I error, especially as it has been shown that intervals without flanking markers (as at telomeres) can have an increased rate of false-positive linkage.88 With the increased number of families in the FM2 stage, one might have expected a higher LOD score than in the initial analysis, which was not observed. Although not reaching statistical significance, our results identify ANGPT4 as a strong candidate gene for AD/mixed AD/VaD in our family material. Confirming this in other populations will, of course, be necessary, but the biological function of ANGPT4 fits very well with a role in the disease pathophysiology of mixed AD/VaD.

Electronic-database information

AlzGene: http://www.alzforum.org/res/com/gen/alzgene/default.asp

HapMap: http://hapmap.ncbi.nlm.nih.gov/

NCBI Home Page: http://www.ncbi.nlm.nih.gov/

UCSC: http://genome.ucsc.edu/cgi-bin/hgGateway