Introduction

Atherosclerosis is the primary cause of coronary heart disease (CHD) and stroke, accounting for approximately 50% of all deaths in Western societies.1 Plasma high-density lipoprotein cholesterol (HDL-C) levels are inversely related to the risk of CHD,2 and low plasma HDL-C is the most common dyslipidemia associated with premature and familial CHD.3 Genetic factors account for approximately 50% of the variability in plasma HDL-C levels, and the inheritance of HDL-C levels is complex, being influenced by multiple genetic and environmental factors and by interactions between these factors.4 So far, genome-wide scans for the loci regulating HDL-C levels have been published with significant results reported with respect to 18 chromosomes (linkage studies are summarized in Table 1a 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 and genome-wide association studies in Table 1b 26, 27, 28, 29, 30, 31, 32), which is indicative of high genetic heterogeneity in HDL-C regulation. This genetic heterogeneity extends to the general link between low plasma HDL-C levels and increased CHD risk, as some genetic variants are associated with low HDL-C levels and decreased CHD risk (eg, apoA-I (Milano))33 and some with high HDL-C levels and increased CHD risk (eg, a hepatic lipase gene variant).34 To reduce the heterogeneity, we collected the sample based on probands with low plasma HDL-C levels and premature CHD, and other CHD risk factors were reduced by allowing only normal or moderately elevated levels of triglycerides (TG) and total cholesterol and by excluding probands with diabetes. The Finns originate from small founder populations sharing a relatively homogeneous gene pool and environment,35 both of which are helpful for the mapping of susceptibility genes.

Table 1a Summary of the loci for HDL-C in published genome-wide linkage scans
Table 1b Summary of the loci for HDL-C in published genome-wide association studies

The aim of this study was to search the whole genome for loci regulating plasma HDL-C levels in the population of Northern Finland, where there is a high prevalence of CHD. On the basis of the close relationship between CHD and low HDL-C, identification of the genes affecting HDL-C levels could lead to the discovery of major genetic factors involved in atherosclerosis susceptibility. To elucidate the genetics of the low HDL-C-trait associated with CHD, linkage analyses for quantitative trait loci (QTLs) regulating plasma HDL-C levels and loci linked to the qualitative low HDL-C trait were performed in 35 Finnish extended pedigrees with premature CHD and low HDL-C levels (644 individuals, 375 genotyped).

Subjects and methods

Subjects studied

Probands with premature CHD (ie, acute myocardial infarction, a coronary artery bypass graft operation or percutaneous transluminal coronary angioplasty before the age of 55 years) were selected from the records of Oulu University Hospital. In addition, the probands were required to have low HDL-C levels (<1.1 mmol/l) and normal to moderately elevated levels of TG (<3.5 mmol/l) and total cholesterol (<7 mmol/l), no diabetes and an entry in the hospital records indicating a family history of CHD. All the relatives of the proband (independent of their CHD status and HDL-C levels) who were willing to participate were examined, and extended pedigrees were recruited to ascertain the genetic background. The total number of family members was 644 (with 375 subjects genotyped). There were 35 pedigrees with three generations on average (minimum 2 and maximum 5) and with an average pedigree size of 19 subjects (minimum 5 and maximum 89). Blood samples for DNA extraction and lipid measurement were obtained from each subject, and lipid measurements were performed in the case of the CHD patients before or at least 3 months after myocardial infarction or coronary bypass operation. Information about medication, past medical history and smoking (recorded as current smoker or not, in pack-years of smoking and as the number of cigarettes smoked currently) was elicited using a questionnaire. Altogether, 40 of the subjects were receiving statin therapy. The daily doses of statin taken by our subjects (collected between 1993 and 2000, before the era of intensive statin treatment) were small (simvastatin 10 mg/day 20 subjects, simvastatin 20–30 mg/day 3 subjects, lovastatin max 40 mg/day 6 subjects, fluvastatin max 40 mg/day 5 subjects, atorvastatin max 20 mg/day 3 subjects and pravastatin max 20 mg/day 3 subjects), and their effect on plasma HDL-C levels was expected to be only modest (an increase of <6%),36 so that these subjects were not excluded from the analysis. However, to control for statin use as a possible confounding factor, the chromosomes revealing an LOD/NPL score >1.5 were re-analyzed with a constant (6% of the mean HDL-C levels: 0.06 mmol/l for men and 0.09 mmol/l for women) subtracted from the HDL-C values of the statin users.37 The study was approved by the ethical committee of the University of Oulu, and all the subjects gave their written informed consent.
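The constant correction for statin users can be illustrated with a short Python sketch. It is not the authors' code: the `Subject` record, the "M"/"F" sex coding and the function name are hypothetical, and only the two offsets are taken from the text.

```python
from dataclasses import dataclass

# Offsets quoted in the text (6% of the sex-specific mean HDL-C, mmol/l).
STATIN_OFFSET = {"M": 0.06, "F": 0.09}


@dataclass
class Subject:
    hdl_c: float      # measured plasma HDL-C, mmol/l
    sex: str          # "M" or "F" (hypothetical coding)
    on_statin: bool   # True if the questionnaire reported statin use


def statin_corrected_hdl(subject: Subject) -> float:
    """Return HDL-C with the fixed offset subtracted for statin users only."""
    if subject.on_statin:
        return subject.hdl_c - STATIN_OFFSET[subject.sex]
    return subject.hdl_c
```

A fixed offset, rather than a per-subject percentage, shifts every statin user of the same sex by the same amount, so the ranking of subjects within that group is preserved.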

Lipid and lipoprotein measurements

Blood samples were obtained after an overnight fast. Plasma lipoprotein fractions were separated using sequential ultracentrifugation, and concentrations of cholesterol and TG in the plasma and lipoprotein fractions were determined using enzymatic colorimetric methods, as described in detail earlier.38

DNA analysis

The ABI Prism Linkage Mapping Set 1 (Applied Biosystems, Foster City, CA, USA) was used for genotyping microsatellite markers in the initial screening. The set consisted of 358 markers covering the whole genome (except the Y chromosome) at about 10 cM resolution. The markers were originally selected from the 1996 Genethon map.39 Additional markers on chromosomes 2, 4, 6, 10, 15 and 22 were selected for the regions showing highest evidence of linkage. Genomic DNA extraction and the microsatellite marker genotyping with the ABI 377 and ABI 3100 automatic sequencers were performed as described earlier40 and recommended by the manufacturer (Applied Biosystems). The sizes of the alleles were determined using the Genescan (Version 3.1), Genotyper (2.0) (with ABI 377) and GeneMapper (3.5) (with ABI 3100) programs. A total of eight SNPs (shown in Table 4) on two functional candidate genes, peroxisome proliferator activated receptor delta (PPARD) and retinoid X receptor beta (RXRB), were selected from the HapMap to cover all the haploblocks of the genes. The genes were located on chromosome 6 near the only region showing suggestive evidence of linkage in both qualitative and quantitative linkage analysis. The SNPs were genotyped using TaqMan SNP Genotyping Assays and the Applied Biosystems 7000 Real-Time PCR System (Applied Biosystems).

Statistical analysis

Pedigree integrity was checked with GRR (Graphical Representation of Relationships),41 which uses identity-by-state (IBS) allele sharing to detect pedigree errors, as pairs of different classes of relatives and non-relatives can be characterized by their distinct distributions of alleles shared across the genome. Mendelian errors were checked with the PedCheck program.42

Quantitative linkage analysis

A multipoint variance components linkage analysis was used to test linkage between the marker loci and HDL-C, which was based on specifying the expected genetic covariances between pairs of relatives as a function of their identity by descent (IBD) at a marker linked to a QTL.43 The total observed phenotypic variance was split into components attributable to QTL, residual polygenic effects and non-genetic effects.
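The decomposition described above is usually written in the following standard form. The symbols are not given in the text and are introduced here only for illustration: for relatives i and j, \hat{\pi}_{ij} is the estimated proportion of alleles shared IBD at the tested position, \phi_{ij} is the kinship coefficient, \delta_{ij} is 1 when i = j and 0 otherwise, and \sigma^2_q, \sigma^2_g and \sigma^2_e are the QTL, residual polygenic and non-genetic variance components.

```latex
\Omega_{ij} \;=\; \hat{\pi}_{ij}\,\sigma^2_{q} \;+\; 2\,\phi_{ij}\,\sigma^2_{g} \;+\; \delta_{ij}\,\sigma^2_{e},
\qquad
\mathrm{LOD} \;=\; \log_{10}\frac{L\!\left(\hat{\sigma}^2_{q},\hat{\sigma}^2_{g},\hat{\sigma}^2_{e}\right)}{L\!\left(\sigma^2_{q}=0,\hat{\sigma}^2_{g},\hat{\sigma}^2_{e}\right)}
```

Under this formulation, linkage is tested by comparing the likelihood of the model in which \sigma^2_q is estimated with that of the model in which it is constrained to zero.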

The two-point and multipoint analyses were performed with the SOLAR (Sequential Oligogenic Linkage Analysis Routines) program,44 version 4.0.7 (Copyright © 1995–2009, Southwest Foundation for Biomedical Research, http://solar.sfbrgenetics.org/), using variance components analysis for extended pedigrees. Multipoint identity-by-descent matrices were estimated using Markov chain Monte Carlo methods implemented in SimWalk2 (http://www.genetics.ucla.edu/software/simwalk)45 (with direct SOLAR support). The presence of a putative QTL was tested by means of a likelihood ratio statistic (LOD score). Plasma HDL-C levels were used as a continuous variable in the analysis, and the sex, age and body mass index (BMI, only when stated) of the subjects, being statistically significant covariates in the data set (P<0.001), were also considered in the analysis. Ascertainment correction by conditioning for the probands was performed. To deal with skewness and kurtosis, log transformation and LOD adjustment were used, the latter using a simulation to build up the distribution of LOD scores that one could expect to observe under the null hypothesis of no linkage. This consisted of 10 000 trials, in each of which a fully informative marker completely unlinked to the trait was simulated and trait linkage was then tested at that simulated marker. The LOD adjustment regresses the observed LOD scores against the LOD scores expected for a multivariate normal trait. The inverse of the slope of the regression line is the LOD adjustment. Empirical P-values for the LOD scores in the sample were also determined using the pedigree data by simulation of 10 000 replicates of a fully informative marker completely unlinked to the trait.
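The empirical P-value computation described above reduces to counting how many null replicates reach the observed LOD score. The sketch below is a minimal illustration in Python, not the SOLAR implementation; the function name and the toy null distribution are assumptions made for the example.

```python
from typing import Sequence


def empirical_p_value(null_lods: Sequence[float], observed_lod: float) -> float:
    """Fraction of null replicates whose LOD score is at least the observed one."""
    if not null_lods:
        raise ValueError("need at least one simulated replicate")
    exceed = sum(1 for lod in null_lods if lod >= observed_lod)
    return exceed / len(null_lods)


# Toy null distribution (illustrative only): 10 000 replicates of an
# unlinked marker, 4 of which reach an LOD score of 3.1 or more.
null_lods = [0.0] * 9996 + [3.1, 3.3, 3.5, 4.0]
p = empirical_p_value(null_lods, 3.1)
```

This sketch uses the plain proportion k/n; some authors prefer (k+1)/(n+1) so that the estimate can never be exactly zero.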

Qualitative linkage analysis

For the qualitative non-parametric linkage (NPL) analysis, subjects having their measured HDL-C levels in the lowest 10th percentile of the sex-specific population HDL-C levels (distributions taken from the control cohort of the OPERA study38) were coded as being affected. This limit was 1.08 mmol/l for women and 0.864 mmol/l for men. Those with their measured plasma HDL-C levels over the lowest 10th percentile were coded as being unaffected. The sample for the qualitative analysis included only those pedigrees with at least two subjects affected according to the low HDL-C criterion (19 pedigrees, 388 individuals, 241 genotyped).
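The affection-status coding and the pedigree inclusion rule can be sketched as follows. This is an illustrative Python fragment, not the authors' pipeline: the function names and the "M"/"F" coding are hypothetical, and treating a value exactly at the cut-off as affected is an assumption, since the text does not state how ties were handled.

```python
from typing import Iterable, Tuple

# Sex-specific 10th-percentile cut-offs quoted in the text (mmol/l).
LOW_HDL_CUTOFF = {"F": 1.08, "M": 0.864}


def is_affected(hdl_c: float, sex: str) -> bool:
    """Code a subject as affected if HDL-C is at or below the cut-off.

    Inclusion of the boundary value is an assumption of this sketch.
    """
    return hdl_c <= LOW_HDL_CUTOFF[sex]


def pedigree_qualifies(members: Iterable[Tuple[float, str]]) -> bool:
    """A pedigree enters the qualitative analysis with at least two affected members."""
    n_affected = sum(1 for hdl_c, sex in members if is_affected(hdl_c, sex))
    return n_affected >= 2
```

For example, a pedigree containing a man with 0.80 mmol/l and a woman with 1.00 mmol/l would qualify, whereas a pedigree with only one such member would not.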

NPL analysis, also known as allele sharing statistics, is based on IBD measurements at the marker loci and is independent of specific models for the inheritance of the trait phenotype. If a marker is linked to a disease locus, one expects to see among the affected cases a clustering of a few marker alleles descended from the pedigree founders. The multipoint NPL analysis was carried out in this study using the SimWalk2 (version 2.83)45 and Merlin (version 0.9.12b, http://www.sph.umich.edu/csg/abecasis/MERLIN/)46 programs. The allele frequencies for each marker were estimated from all individuals by the Downfreq program.47 Mega248 was used to construct all the input files for the SimWalk2 and Merlin programs. An exact analysis using the Lander–Green algorithm was performed by Merlin on the pedigrees of small or intermediate size (9 pedigrees) and the more complex pedigrees (10 pedigrees) were analyzed by SimWalk2 using Markov chain Monte Carlo and simulated annealing algorithms. SimWalk2 was used to combine the pre-computed scores for the smaller pedigrees with the estimates obtained for the large pedigrees and then to compute empirical P-values both for individual pedigrees and for the overall data set.

SimWalk2 presents the results of five statistics (BLOCKS, MAX-TREE, ENTROPY, NPL_PAIR and NPL_ALL) as empirical P-values and −log10(P-value) (NPL score), in which each statistic measures the degree of clustering of the founder alleles among the affected cases. BLOCKS is most powerful at detecting linkage to a recessive trait, MAX-TREE is the most powerful at detecting linkage to a dominant trait, ENTROPY is a measure of the entropy of the alleles among the affected cases, and NPL_PAIR and NPL_ALL are most powerful at detecting linkage to an additive trait and also the two most commonly used statistics incorporated in the most widely used software packages.49 NPL_ALL (used in the results when not otherwise reported) is a measure of whether a few founder alleles are over-represented in the affected cases and corresponds to the NPL-all statistic used in GeneHunter, Allegro, Mendel and Merlin, for example.
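Because the NPL scores are reported as −log10 of the empirical P-value, the two quantities can be converted into each other directly. The helper functions below are a small illustrative sketch in Python; the function names are hypothetical and are not part of SimWalk2.

```python
import math


def npl_score(p_value: float) -> float:
    """Convert an empirical P-value into a -log10(P) score."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("P-value must lie in (0, 1]")
    return -math.log10(p_value)


def p_from_npl(score: float) -> float:
    """Convert a -log10(P) score back into a P-value."""
    return 10.0 ** (-score)
```

On this scale a score of 2.0 corresponds to P=0.01 and a score of 3.0 to P=0.001.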

Association analysis

QTL association was tested with the Mendel program using variance component models, treating genotypes at the marker locus as predictors modifying the mean for a quantitative trait (QTL association, model 1). This ‘measured genotype’ approach controls for random environment and polygenic backgrounds while remaining in the frequentist domain of maximum likelihood estimation and likelihood ratio tests.50

All the calculations were performed on the computers of the Center for Scientific Computing, Espoo, Finland.

Results

The characteristics of the whole sample and the subset of subjects used in the qualitative analysis are shown in Table 2. The only statistically significant difference between these study samples was in HDL-C levels (P<0.05), reflecting the inclusion criterion for the qualitative linkage analysis, which required each family to have at least two affected (low HDL-C phenotype) members. CHD was diagnosed in 41% of our study subjects (56% of the men and 21% of the women in the whole sample), and their relatively young age at CHD onset (49 and 53 years, respectively) emphasizes the potentially strong genetic component in CHD in our pedigrees. The mean plasma HDL-C levels of the sample (1.07 mmol/l for men and 1.39 mmol/l for women) were lower than the levels reported for the Northern Finnish population (1.22 mmol/l for men and 1.56 mmol/l for women).38 CHD patients had lower plasma HDL-C levels than the healthy subjects among both the men (0.96 vs 1.15 mmol/l, P<0.001) and the women (1.27 vs 1.43 mmol/l, P=0.06), suggesting the importance of HDL as a risk factor for CHD. Only 40 study subjects (ie, 11% of the whole sample) used statins, as the sample was collected before the ‘statin era’. The studied subjects were slightly overweight according to their average BMI (men 26.9, SD 3.9; women 25.5, SD 4.7).

Table 2 Characteristics of all the genotyped subjects (quantitative analysis sample) and of the subset of the subjects used for the analysis of low HDL-C trait (qualitative analysis sample)

Quantitative linkage analysis

The estimated additive genetic heritability of plasma HDL-C was 43% and the proportion of variance due to covariates (sex and age) was 23%. On the basis of the empirical P-value estimate, if the hypothesis of no linkage were true, we would expect 4 out of 10 000 observed LOD scores to be ⩾3.1, giving us an empirical P-value of 0.0004 (unadjusted) for the LOD score of 3.1 in our sample. With 10 000 simulations, we found LOD scores ⩾2.0 nine times (empirical P=0.0009) and LOD scores ⩾1.9 12 times (empirical P=0.0012). To determine the level of suggestive evidence of linkage, we compared the unadjusted empirical P-values from the simulations with the point-wise levels calculated using the equation presented in the article of Lander and Kruglyak.51 The average crossover rate, given the types of relatives in our pedigrees, was 2.94. According to this, our threshold for suggestive evidence of linkage is P=0.0011, and when we compare it with the results of the simulations, we can conclude that an LOD score of 2.0 may be considered as suggestive evidence of linkage in our study.

The initial QTL scan, carried out on 35 pedigrees with sex and age as covariates, yielded multipoint LOD scores (MLOD) of over 1 on chromosomes 2 (MLOD of 1.6: peak between the markers D2S364 and D2S325), 4 (2.4: D4S405 and D4S428), 6 (1.5: D6S289 and D6S276), 15 (1.3: D15S117 and D15S153) and 17 (2.0: D17S784). In the second stage, new markers were genotyped (Figure 1a), and chromosomes 2, 4, 6 and 17 exceeded the level of suggestive evidence of linkage (LOD scores ⩾2.0) at chromosomal locations 2q33 (MLOD 2.1), 4p12 (two-point LOD 3.1, MLOD 2.6), 6p24 (two-point LOD 2.1) and 17q25 (MLOD 2.0) (Figure 2), whereas the LOD score for chromosome 15 (MLOD 1.9) was narrowly under the level of suggestive linkage. Chromosomes with LOD scores >2 were also analyzed with all the significant covariates (sex, age and BMI). Inclusion of BMI did not alter the results, except in the case of chromosome 6, in which the region 6p24 yielded a two-point LOD score of 2.7. In conclusion, chromosomal regions 2q33, 4p12, 6p24 and 17q25 showed suggestive evidence of linkage in the quantitative linkage analysis, the linkage on 4p12 being close to the level that is generally considered as significant evidence of linkage. The results are also summarized in Table 3.

Figure 1

(a) Results of the genome-wide quantitative analysis. All genotyped markers were included in the analysis. The x axis indicates the distance (cM) from the p-terminus and the y axis indicates the LOD score. Chr, chromosome. (b) Results of the genome-wide qualitative analysis. All the statistics calculated before adding markers to the regions with LOD score ⩾1.5 are presented in the figure. The x axis indicates the distance (cM) from the p-terminus and the y axis indicates the NPL score. Chr, chromosome.

Figure 2

Results showing suggestive evidence of linkage (LOD/NPL score ⩾2.0) in the case of six chromosomes in quantitative analysis (HDL-C as a continuous variable, age and sex as covariates) or in qualitative analysis (subjects having their measured HDL-C levels in the lowest 10th percentile for the general population were coded as affected). Only the results of the MAX-TREE statistics (most powerful at detecting linkage to a dominant trait) or NPL_ALL scores (most powerful at detecting linkage to an additive trait) are reported here for the qualitative analysis, but the results of the other statistics were consistent with these assessments. The x axis indicates the distance (cM) from the first genotyped marker and the y axis indicates the LOD/NPL score. Chr, chromosome. , quantitative multipoint analysis; , qualitative multipoint analysis, NPL_ALL; The same analysis in the initial scan (no additional markers); qualitative multipoint analysis, MAX-TREE; , qualitative multipoint analysis with overweight (BMI>25) subjects with low HDL-C coded as affected, NPL-ALL; , quantitative two-point analysis with sex, age and BMI (for Chr6) as covariates; , qualitative two-point analysis.

Table 3 Highest LOD scores in the QTL analysis and NPL scores in the nonparametric multipoint linkage analysis for the qualitative low HDL-C trait

Qualitative linkage analysis

The first stage of qualitative multipoint NPL analysis (Figure 1b) revealed one suggestive locus with an NPL score (−log10P-value) of 2.6 (P=0.003) at chromosomal locus 10p15.3 (close to the marker D10S249). For the regions in which the scores were >1.5, we selected nine new markers: five on chromosome 10, two on chromosome 6 and two on chromosome 22. In addition, four new families with 27 genotyped persons were added to the analysis. Thus, genotyping of the additional markers and families resulted in suggestive NPL scores of 2.1 on chromosome 6p12, 2.3 on 10p15 and 2.5 on 22q11 (Figure 2). The results are also summarized in Table 3.

Further analysis

As a post hoc analysis, to control for the confounding effect of statin use, all the chromosomes with LOD/NPL scores >1.5 were re-analyzed after subtracting a constant (6% of the sex-specific mean) from the HDL-C values of all statin users. The new results of the quantitative analysis differed somewhat from the previous results, with the new LOD scores being 1.9 for chromosome 2 (2.1 reported previously), 2.9 for chromosome 4 (3.1), 2.8 for chromosome 6 (2.7), 2.4 for chromosome 15 (1.9) and 1.5 for chromosome 17 (2.0). The new results of the qualitative analysis were NPL scores of 2.0 for chromosome 6 (2.1 reported previously), 1.8 for chromosome 10 (2.3) and 2.6 for chromosome 22 (2.5). The results are also summarized in Table 3. In conclusion, most of the chromosomal regions showing suggestive evidence of linkage in the original analyses also remained suggestive after statin correction, the regions being 4p12, 6p24, 6p12, 15q22 and 22q11.

Chromosome 6, which showed suggestive evidence of linkage in both quantitative and qualitative analysis, was analyzed further. The MAX-TREE statistic of the qualitative analysis resulted in an NPL score of 2.1, suggesting a dominant trait on 6p12, whereas NPL_ALL, which is most powerful at detecting linkage to an additive trait, yielded a somewhat higher result for 6p22 than for 6p12. When we included only overweight subjects (BMI>25) in the qualitative analysis for the low HDL-C trait, the NPL score on 6p22 was 1.8, whereas after the inclusion of BMI in the QTL analysis, the highest LOD scores were 2.7 (D6S309) and 2.1 (D6S507), for the region 6p24–22, illustrating that the linkage regions of the qualitative and quantitative analysis on chromosome 6 approached each other when BMI was taken into account.

The highest evidence of association in the quantitative association analysis was obtained with marker D6S1713 on 6p25, with a P-value of 0.03. The results of the association analysis with respect to chromosome 6 are summarized in Table 4.

Table 4 P-values for the quantitative association analysis

Discussion

A low HDL-C level has been found to be the most prevalent dyslipidemia associated with premature CHD.3 Given this close relationship, establishment of the genetic background to HDL regulation could illuminate the etiology and pathogenesis of CHD and provide new tools for its prevention and treatment. We undertook a whole-genome scan and detected QTLs for HDL-C regulation showing suggestive evidence of linkage on chromosomes 2q33, 4p12, 6p24 and 17q25. Three loci for the qualitative low HDL-C trait showing suggestive evidence of linkage were identified in the chromosomal regions 6p12, 10p15.3 and 22q11. The nearest functional candidate genes for the identified loci are shown in Table 3. Chromosome 2q33 (rs2943634) was significantly associated with CHD in the WTCCC Study, and this finding was replicated in the German MI Family Study in a recently published genome-wide association analysis.52 Chromosomal locations 6p12-q12 (ref. 12) and 6p22 (ref. 10) have been linked to HDL-C levels, and an amino acid substitution in the endothelin (EDN1) gene located on 6p24.1 has previously been associated with HDL-C levels in a large analysis of 103 candidate genes for CHD and associated phenotypes in a founder population.53 Chromosome 10 has previously shown linkage to HDL-C and TG in other Finnish studies18, 54 and also to obesity.55 Chromosomal region 22q11-q13 has provided suggestive evidence of linkage to HDL-C in a genome-wide scan of serum lipid levels in the Old Order Amish56 and in an Australian sample.10

PPARD, located on 6p21.2-p21.1, is a nuclear transcription factor regulating lipid metabolism, and its agonists promote reverse cholesterol transport, partly by increasing ABCA1 transcription,57 and have been shown to increase plasma HDL-C concentrations in insulin-resistant mice58 and rhesus monkeys.57 RXRB, on 6p21.3, is a transcription factor that forms heterodimers with oxysterol receptors (liver X receptors (LXRs)) and upregulates ABCA1 and apoA-I-mediated cholesterol efflux.59 However, the genotyped PPARD and RXRB SNPs showed no statistically significant evidence of association with HDL-C levels in our sample. As quantitative traits are inherently more informative than disease–health dichotomies, we analyzed the data using the quantitative association test. Marker D6S1713 on 6p25 revealed suggestive evidence of association in the quantitative association analysis.

The suggestive QTLs affecting HDL-C variance and suggestive loci for the low HDL-C trait in our sample were partly located on different chromosomes, suggesting that the general variability in HDL-C at the population level may be affected by genes other than those causing the lowest HDL-C levels. It is also important to acknowledge the differences between the two statistical methods and the loss of information because of dichotomizing the trait for the qualitative analysis. We recognize that by analyzing the data using multiple statistics and approaches, there is danger inherent in multiple unadjusted tests, but this concern is at least partly compensated for by the greater power to detect linkage with such a complex trait. Although no statistically significant evidence of linkage was observed, it was encouraging that we found suggestive evidence of linkage in seven chromosomal regions. By definition, suggestive evidence of linkage is expected to arise by chance about once per genome scan, so with seven suggestive regions some of these hits are likely to be real rather than chance findings. To confirm the results and to examine the possibility of false-positive signals, these results will need to be replicated in another sample.

The genetic basis of HDL-C regulation seems to be largely heterogeneous. To reduce the heterogeneity and increase the power for finding potential rare variants, our sample was collected from a relatively isolated geographical region and contained large families with probands fulfilling criteria that narrowed the phenotype (the low HDL-C trait associated with early CHD). The linkages revealed in this study should therefore show evidence of loci affecting the clinically important atherogenic low HDL-C trait. We found evidence of linkage with seven chromosomal regions, the highest having an LOD score of 3.1. Two of the loci have previously shown significant evidence of linkage to HDL-C levels (6p12; ref. 12) and CHD (2q33).52 When taking into account the constant correction for the statin users, the strongest evidence of linkage in the quantitative analysis was detected on chromosomes 4p12 and 6p24, and in the qualitative analysis on chromosomes 6p12 and 22q11. In addition, in the quantitative analysis, chromosome 15q22 provided an LOD score exceeding the level of suggestive evidence of linkage only after the constant correction. This same region, including the gene for hepatic lipase (LIPC), has repeatedly shown linkage or association with HDL-C.16, 22, 23, 26, 27, 28, 29, 30, 31, 32 We cannot exclude the possibility that the linkage signals on chromosomes 17q25 and 10p15 could have been false positives originally, as they decreased more than the other cases, in which the changes in LOD/NPL scores because of statin correction were ±0.1–0.2. From the previously published genome scans for low-HDL-C loci (Tables 1a and b), the samples of the French Canadian9 and the Finnish studies15, 18, 25 were clinically the most similar to our sample. However, the French Canadian study did not provide information about the CHD status of the subjects, and the other Finnish studies included families with familial combined hyperlipidemia (FCHL), making the lipid profile of those samples somewhat different from the current one. The two Finnish scans for low-HDL-C loci have shown significant evidence of linkage to chromosomal regions 8q23 (ref. 15) and 10q11,18 which do not overlap with our results. Chromosomal regions 2q21.1–22 and 2q31, which showed significant evidence of linkage to CHD60 and suggestive evidence of linkage to the TG trait54 in other Finnish scans, are closer to our finding. Altogether, 18 chromosomes have been described earlier as showing significant evidence for loci regulating HDL-C levels (Tables 1a and b). This may at least partly be because of some false-positive signals and the different selection criteria used to recruit the populations, as some samples have been selected on the grounds of CHD, familial hypercholesterolemia, diabetes or hypertension, whereas others have been assembled randomly. Because of the small number of original founders, the isolation of the Finnish population for the past centuries and genetic drift, the loci for HDL-C regulation in the homogeneous Finnish population, or even in regional subpopulations, might be specific and somewhat unique,61, 62 whereas they could be of minor importance in other, more heterogeneous, populations. Their identification would nevertheless provide important knowledge about HDL metabolism and the relationship between HDL and atherogenesis.